Generate an RSS Feed of recent files inside a Git repository

Posted on 16 Mar, 2021
I like to log my notes and TILs in a Git repository, and recently I had the idea of showcasing (and automating) my most recent learnings on my GitHub profile.
GitHub is nice enough to provide RSS feeds for the latest commits in a repo, but it lacks the most basic thing: telling me which commit introduced a new file (i.e. the recent files in a repo).
The following command shows each relative path that was added to the Git history, along with the commit date (most recent first):
git log --no-color --date=format:'%d %b, %Y' --diff-filter=A --name-status --pretty='%ad'
If you just want the file names, leave the --pretty option empty:
git log --no-color --date=format:'%d %b, %Y' --diff-filter=A --name-status --pretty=''
You should see something like this:
A scripts/oib
A scripts/surf
A snippets/python.snippets
A snippets/markdown.snippets
A .Xmodmap
A codesnippets/
A scripts/areyouok.go
A scripts/
A scripts/
To limit the output to recent results, use the -n flag. It counts commits, not files, so the number of paths printed can differ from N:
git log --no-color -n 5 --date=format:'%d %b, %Y' --diff-filter=A --name-status --pretty=''
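Since -n applies to commits, a single commit that adds many files can push the list past N entries. One way to get exactly N paths is to drop -n and trim the output afterwards. Here is a minimal sketch, assuming the output of the --pretty='' variant has already been captured as a string. The helper name latest_added is hypothetical and the sample text is made up for illustration:

```python
from itertools import islice


def latest_added(log_text, n):
    """Return the first n added paths from `git log --name-status` output."""
    added = (
        line.split("\t", 1)[1]          # drop the leading "A" status column
        for line in log_text.splitlines()
        if line.startswith("A\t")       # git separates status and path with a tab
    )
    return list(islice(added, n))


sample = "A\tscripts/oib\nA\tscripts/surf\n\nA\tsnippets/python.snippets\n"
print(latest_added(sample, 2))  # ['scripts/oib', 'scripts/surf']
```

Because git already prints the newest commits first, taking the first N lines is enough and no extra sorting is needed.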
If you want to include renames as well, add R to the filter:
git log --no-color --date=format:'%d %b, %Y' --diff-filter=AR --name-status --pretty=''
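With R in the filter, rename entries look different from additions: git prints a status such as R100 followed by the old path and the new path, all separated by tabs. Here is a small sketch of parsing both kinds of line. The helper parse_status_line and the sample paths are hypothetical, not taken from the script in this post:

```python
def parse_status_line(line):
    """Split one --name-status line into (status letter, current path)."""
    fields = line.split("\t")
    status = fields[0][:1]  # "A", or "R" taken from something like "R100"
    path = fields[-1]       # for renames, the new path is the last field
    return status, path


print(parse_status_line("A\tscripts/surf"))
print(parse_status_line("R100\tscripts/old.sh\tscripts/new.sh"))
```

Taking the last field means an added file and a renamed file both resolve to the path that exists in the working tree today.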
The magic here is done by the --diff-filter=A option, which only shows files that were Added. I remember using this to find the birthday of README files.
NOTE: We are assuming the file creation date to be the date of the commit that introduced the file. Since it's a feed for a Git repo, this should make sense (I was born when I was committed 😁️)
#!/usr/bin/env python3
# Script to generate a feed of recently committed files in a git repository
# TODO: Add Commit Author #
import subprocess as sp
import pathlib
import re, os
import datetime
from dateutil.parser import parse

HEAD = """<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:atom="">
<channel>
"""
FOOTER = """</channel>
</rss>
"""
# Assuming your current working dir is the repo
repo_name = os.path.basename(os.getcwd())
current_date = datetime.datetime.now().strftime("%a, %d %b %Y")

def get_recent_files():
    cmd = "git log --no-color -n 10 --date=rfc --diff-filter=A --name-status --pretty='%ad'"
    result = sp.Popen(cmd, shell=True, stdout=sp.PIPE, stderr=sp.PIPE)
    out, err = result.communicate()
    # Strip the "A<TAB>" status prefix and drop empty lines
    clean_output = out.decode("utf-8").replace("A\t", "").split("\n")
    clean_output = list(filter(lambda x: x != "", clean_output))
    files = []
    date = None
    for item in clean_output:
        if is_valid(item):
            # A date line applies to every path that follows it
            date = item
        elif pathlib.Path(item).exists():
            # Skip files that were added once but no longer exist
            entry = item, date
            files.append(entry)
    return files

def is_valid(date):
    # A line counts as a date if dateutil can parse it
    try:
        return isinstance(parse(date), datetime.datetime)
    except (ValueError, OverflowError):
        return False

def get_repo_link():
    repo_origin = "git config --get remote.origin.url"
    result = sp.Popen(repo_origin, shell=True, stdout=sp.PIPE, stderr=sp.PIPE)
    result, err = result.communicate()
    # Strip the trailing newline printed by git
    return result.decode("utf-8").strip()

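if __name__ == "__main__":
    files = get_recent_files()
    with open("feed.xml", "w") as feed:
        feed.write(HEAD)
        feed.write(
            f"""<description>Recently committed files in {repo_name}</description>\n"""
        )
        feed.write(f"<lastBuildDate>{current_date}</lastBuildDate>")
        for item in files:
            path, date = item
            # Assumption: each item carries the file path as its title
            feed.write(f"<item>\n<title>{path}</title>\n<pubDate>{date}</pubDate>\n</item>\n")
        feed.write(FOOTER)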
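File paths can contain characters such as & or <, which have to be escaped before they are written into XML. Here is a hedged sketch of building one item with xml.sax.saxutils.escape from the standard library. The render_item helper, the sample path, and the choice of title and pubDate as the only child elements are assumptions for illustration:

```python
from xml.sax.saxutils import escape


def render_item(path, date):
    """Return one RSS <item> with the path and date XML-escaped."""
    return (
        "<item>\n"
        f"<title>{escape(path)}</title>\n"
        f"<pubDate>{escape(date)}</pubDate>\n"
        "</item>\n"
    )


print(render_item("notes/q&a.md", "Tue, 16 Mar 2021 16:24:47 +0530"))
```

Escaping at the point where the string is built keeps the rest of the script unchanged: the loop only has to write whatever the helper returns.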
The generated XML is valid enough to be consumed without any issue. Here is a demo of the output from the above script:
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:atom="">
<description>Recently committed files in til</description>
<lastBuildDate>Wed, 17 Mar 2021</lastBuildDate><item>
<pubDate>Tue, 16 Mar 2021 16:24:47 +0530</pubDate>
<pubDate>Tue, 16 Mar 2021 16:24:47 +0530</pubDate>
<pubDate>Mon, 15 Mar 2021 19:26:12 +0530</pubDate>
<pubDate>Sat, 13 Mar 2021 13:11:02 +0530</pubDate>
<pubDate>Sun, 7 Mar 2021 19:42:28 +0530</pubDate>
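The pubDate values carry a full timestamp because of --date=rfc, while lastBuildDate only has a date. RSS 2.0 expects RFC 822 style date-times, so one option is to format the build time with email.utils.format_datetime from the standard library. Here is a minimal sketch. The build_date helper is hypothetical, and using UTC for the build time is an assumption:

```python
import datetime
from email.utils import format_datetime


def build_date(now=None):
    """Format a timezone-aware datetime as an RFC 2822 date string."""
    if now is None:
        # Assumption: the build time is reported in UTC
        now = datetime.datetime.now(datetime.timezone.utc)
    return format_datetime(now)


fixed = datetime.datetime(2021, 3, 17, 9, 30, tzinfo=datetime.timezone.utc)
print(build_date(fixed))  # Wed, 17 Mar 2021 09:30:00 +0000
```

The result has the same shape as the pubDate lines, so both fields in the feed end up in one consistent format.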