Exploring Large Codebases

Posted on 25 May, 2021

Developers spend more than 50% of their time reading and modifying existing source code rather than writing new code, so reading code is an essential skill. That's why I maintain this log of resources, tips, etc. for exploring unfamiliar codebases.

Picking a small task is usually a good way to start exploring a codebase, but as developers we need some power tools as well.

  • Write a glossary (class/function names, file/variable prefixes). As you read the code and encounter terms you don't know, write them down. Try to explain what they mean and how they relate to other terms. Create docs for yourself.

  • Check the commit log for interesting commits.

    • Check the most commonly edited files.

  • Read and run the tests (if any). Tests are (usually) a very clear and simple insight into otherwise complex functionality. They read like a spec: this method should do this, this class should do that, and so on.

  • Read the documentation and comments (if any). They can really help you understand the hows and whys.


  • When exploring a git repo, add these aliases (under [alias] in your ~/.gitconfig) to dig up more insights:

    • git top10: most actively edited files.

    • git his: Examples of commits that modify a file (helps in understanding what needs to be changed)

    • git wot: See how a function evolved

          [alias]
          # find commits that changed a file: git his <filepath>
          his = log --follow --color=always --date=format:'%d %b, %Y' --pretty=format:'(%Cgreen%h%Creset)[%ad] %C(blue bold)%s%Creset'
          # trace how a function changed across commits: git wot :function_name:filepath
          wot = log --date=format:'%d %b, %Y' --pretty='%n%C(yellow bold)📅️ %ad%Creset by (%C(green bold)%an%Creset) %C(cyan bold)%h%Creset' --graph -L
          # top 10 most edited files (blank separator lines are dropped before counting)
          top10 = ! git log --pretty=format: --name-only | grep -v '^$' | sort | uniq -c | sort -rg | head -10
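The counting step behind top10 can also be pulled out into a small shell function, which makes it easy to point at just a slice of the history. This is only a rough sketch, assuming a POSIX shell; the function name top_files and its default of 10 are my own choices, not anything git provides.

```shell
# top_files [N]: read file paths on stdin (one per line, as printed by
# `git log --name-only`) and print the N most frequent ones with counts.
# Hypothetical helper, not a git command. N defaults to 10.
top_files() {
  n="${1:-10}"
  # Drop the blank lines git prints between commits, then count duplicates.
  grep -v '^$' | sort | uniq -c | sort -rn | head -n "$n"
}

# Example usage inside a repository (not run here):
#   git log --pretty=format: --name-only | top_files 5
#   git log --since='6 months ago' --pretty=format: --name-only | top_files
```

Restricting the log with --since (or a path after --) before piping it in shows which files are hot right now, which is often more useful than all-time counts on an old repo.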

Resources & Internet Threads
