Links

Exploring Large Codebases

Posted on 25 May, 2021
Developers spend more than 50% of their time reading and modifying existing source code rather than writing new code, so reading code is an essential skill. That's why I am maintaining a log of resources, tips, etc. for exploring unknown codebases.
Usually picking a small task will help you explore the codebase, but of course as developers we need some power tools as well.
  • Write a glossary (class/function names, file/variable prefixes). As you read the code and encounter terms you don't know, write them down. Try to explain what they mean and how they relate to other terms. Create docs for yourself.
  • Check the commit log for interesting commits.
    • Check the most commonly edited files.
  • Read and run the tests (if any). Tests are (usually) a clear and simple window into otherwise complex functionality: this method should do this, this class should do that, etc.
  • Read the documentation and comments (if any). This can really help you understand the hows and whys.
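
To seed the glossary, a rough sketch like the one below can list the most frequent identifier-like words in a source tree. The function name glossary_candidates and the 4-character minimum are my own assumptions, not part of any tool; it uses plain grep so it should run without extra installs.

```shell
# glossary_candidates DIR [N]
# Hypothetical helper: print the N most frequent identifier-like words
# (letters, digits, underscores; at least 4 characters) found under DIR.
glossary_candidates() {
  dir=${1:-.}
  n=${2:-20}
  # -r recurse, -h hide file names, -o print only the matches,
  # -E extended regex, -I skip binary files
  grep -rhoIE --exclude-dir=.git '[A-Za-z_][A-Za-z0-9_]{3,}' "$dir" \
    | sort | uniq -c | sort -rn | head -n "$n"
}
```

For example, `glossary_candidates src 30` prints 30 candidates with their counts. Language keywords will show up too, so treat the output as a starting list to prune by hand.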

Tools

  • ack
  • ag
  • codetour
  • When exploring a git repo, add these aliases to your ~/.gitconfig to get more insight:
    • git top10: the most actively edited files.
    • git his: commits that modified a file (helps in understanding what needs to be changed for a similar task).
    • git wot: how a function evolved over time.
    [alias]
    # find commits that changed a file: git his <filepath>
    his = log --follow --color=always --date=format:'%d %b, %Y' --pretty=format:'(%Cgreen%h%Creset)[%ad] %C(blue bold)%s%Creset'
    # search code in commit history: git wot :function_name:filepath
    wot = log --date=format:'%d %b, %Y' --pretty='%n%C(yellow bold)📅️ %ad%Creset by (%C(green bold)%an%Creset) %C(cyan bold)%h%Creset' --graph -L
    # top 10 most edited files (blank separator lines are dropped before counting)
    top10 = ! git log --pretty=format: --name-only | grep -v '^$' | sort | uniq -c | sort -rg | head -10
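
The wot alias relies on `git log -L :function_name:filepath`, which only lists commits whose diff touches that function. Here is a small sketch to see that behaviour in a throwaway repo. The file hello.py, the function greet, and the helper names are all made up for this demo, and it assumes git is installed.

```shell
# demo_function_history: build a scratch repo and trace one function with `git log -L`.
# hello.py, greet and notes.txt are hypothetical names used only for this demo.
demo_function_history() {
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    git init -q
    # Identity is passed per command so the global git config is left untouched.
    commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

    printf 'def greet():\n    return "hi"\n' > hello.py
    git add hello.py && commit "add greet"

    printf 'def greet():\n    return "hello"\n' > hello.py
    git add hello.py && commit "change greeting"

    # This commit does not touch greet, so -L should skip it.
    printf 'unrelated notes\n' > notes.txt
    git add notes.txt && commit "add notes"

    # Each listed commit starts with a "COMMIT <subject>" line, followed by the function's diff.
    git --no-pager log --pretty=format:'COMMIT %s' -L :greet:hello.py
  )
  rm -rf "$repo"
}
```

Running `demo_function_history` should show the two commits that changed greet and leave out the one that only added notes.txt. In a real repo the equivalent call would be `git wot :function_name:filepath`.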

Resources & Internet Threads