Exploring Large Codebases
Posted on 25 May, 2021
Developers spend more than 50% of their time reading and modifying existing source code rather than writing new code, so reading code is an essential skill. That's why I am maintaining a log of resources, tips, etc. for exploring unknown codebases.
Picking a small task is usually a good way to start exploring a codebase, but of course as developers we need some power tools as well.
- Write a glossary (class/function names, file/variable prefixes). As you read the code and encounter terms you don't know, write them down. Try to explain what they mean and how they relate to other terms. Create docs for yourself.
- Skim the commit log for interesting commits.
- Check the most commonly edited files.
- Read and run the tests (if any). The tests are (usually) a very clear and simple insight into otherwise complex functionality. This method should do this, this class should do that etc.
- Read the documentation and comments (if any). This can really help you understand the hows and whys.
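One way to bootstrap the glossary is to see where a term actually shows up before trying to explain it. Below is a minimal sketch using only `find`, `grep` and `sort`; `term_usage` is a hypothetical helper name I made up for illustration, and it assumes file paths contain no `:` characters.

```shell
# term_usage TERM [DIR]: list files that mention TERM as a whole word,
# most matching lines first. Output format is "path:count".
# Assumption: paths do not contain ':' (it is used as the field separator).
term_usage() {
    term="$1"
    dir="${2:-.}"
    # Skip the .git directory; /dev/null forces grep to print file names
    # even when find hands it a single file.
    find "$dir" -name .git -prune -o -type f \
        -exec grep -cw -- "$term" /dev/null {} + \
        | grep -v ':0$' \
        | sort -t: -k2,2nr
}
```

For example, `term_usage Ledger src/` (both names made up) prints the files under `src/` that mention `Ledger`, busiest first. The files at the top are usually the best place to start reading about that term.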
Tools
When exploring a git repo, add these aliases to gain more insight:
- git top10: most actively edited files.
- git his: examples of commits that modify a file (helps in understanding what needs to be changed).
- git wot: see how a function evolved.
[alias]
    # find commits that changed a file: git his <filepath>
    his = log --follow --color=always --date=format:'%d %b, %Y' --pretty=format:'(%Cgreen%h%Creset)[%ad] %C(blue bold)%s%Creset'
    # search code in commit history: git wot :function_name:filepath
    wot = log --date=format:'%d %b, %Y' --pretty='%n%C(yellow bold)📅️ %ad%Creset by (%C(green bold)%an%Creset) %C(cyan bold)%h%Creset' --graph -L
    # top 10 most edited files (blank separator lines between commits are dropped before counting)
    top10 = ! git log --pretty=format: --name-only | grep -v '^$' | sort | uniq -c | sort -rg | head -10
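The `top10` alias is fixed at ten files over the whole history. If you want to vary the count or narrow the history, the same pipeline can be wrapped in shell functions. This is only a sketch: `count_paths` and `top_files` are hypothetical names, not git commands, and it uses `sort -rn` instead of `-rg` since `-n` is the portable flag.

```shell
# count_paths [N]: read one path per line on stdin and print the N most
# frequent ones (default 10), each prefixed with its count.
# Blank lines, which git prints between commits, are dropped first.
count_paths() {
    grep -v '^$' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# top_files [N] [git-log args...]: most frequently changed files.
# Any extra arguments are passed straight through to `git log`.
top_files() {
    n="${1:-10}"
    [ "$#" -gt 0 ] && shift
    git log --pretty=format: --name-only "$@" | count_paths "$n"
}
```

Something like `top_files 5 --since='6 months ago'` would then show the five files touched by the most commits in the last half year, which is often a better hint at where the current work is happening than the all-time list.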