GSoC-22
Contributed to š„LIT from Google PAIR as part of the GSoC 2022 program!
I participated in GSoC for the first time in 2020. It was a fun experience and eventually was helpful during the process of me landing my first job in software development as well. After spending about a year in that role, I decided to go for something that I have always wanted to do, a PhD in Computer Science. I am starting as a PhD student at George Mason University this Fall (2022). But before I start my graduate journey, I wanted to re-experience GSoC one last time (you can be a GSoC contributor only twice). When I saw š„LIT offering projects for this yearās GSoC, I knew I had to atleast try.
And so I did! I wanted to apply for a lightweight project because that would give me sufficient space to enjoy my time as a contributor, while I prepared to move across continents towards the end of my project. The š„LIT Tutorials project fit this description perfectly. I took some time to understand basic š„LIT components and created a project proposal. This yearās proposal was only 10 pages long (compared to the 40 page behemoth from my previous gsoc application), but it was fundamentally solid. I submitted this and waited for about a month for the results. I was very happy to be selected, because this meant I could spend the summer working on something closely related to my PhD topic. I reached out to my mentor and we had an orientation call soon after. I also met other members of the š„LIT team and one of my fellow contributors as well during the community bonding period.
The ācoding periodā is a bit unique this time, because some of my contributions arenāt exactly ācodeā. To explain, this basically means that I went through the development and feedback process for the tutorials mostly in a google docs environment and then converted it to markdown for rendering in the š„LIT website. Also during this entire process, I had to ideate and run multiple experiments using š„LIT, which sometimes led me to discover bugs. As per feedback on my midterm evals, all of this was helpful to the š„LIT team in making š„LIT a more robust and accessible OSS project :)
Since I will be linking to this blog as my final work product submission, I will link to the relevant issues and PRs below, and add in a short description for them.
Tabular Feature Attribution Tutorial :
- Focuses on Tabular Feature Attribution based on the Kernel SHAP method.
- Uses the penguins demo from the LIT website to illustrate the different ideas.
Text Salience Maps Tutorial :
- Focuses on Text Salience Maps, specificially Grad L2 Norm, Grad Ā· Input, Integrated Gradients and LIME.
- Uses the GLUE demo from the LIT website to illustrate the different ideas.
Issues :
Pull Requests :
Workflow :
- We (me and Ryan) had regular weekly meetings. For these meetings we kept a google doc for meeting notes, where we noted down action items for the next week and brief notes on items discussed in the current week. The meetings covered a wide range of discussions from solving queries in the tutorial creation process, to reviews and feedback, to brainstorming and more.
- For brainstorming, I used a google doc to explore each of the two modules of interest in depth. I noted down possible questions that different types of users might have. And then I created a draft outline by organizing those questions into a structured flow.
- From the outline, I expanded the content into the final draft, which was reviewed multiple times by many people on the LIT team. I incorporated all the feedback into the draft before starting with the PR process.
- For the PR process, I created a branch from
dev
for each of the 2 PRs. - Within the branch, I added the content within a markdown file. Converting from google docs to markdown seemed a very time-consuming task at first, but a google docs extension helped me out with this (Docs to Markdown by Ed Bacher).
- I downloaded the google doc file as a website, to get all my images in one place. But this downloads the images with a slightly arbitrary filenaming system. So, then I went through the HTML file to rename the images in a better readable format for my PR. I added these images and the modified markdown output to my branch. And boom, done!
- A google UXD team would be adding in some final touches to the images (adding annotations and converting some of them to gifs) once the review process is complete, and then the tutorials would be hosted on the live website for LIT.
Overall, this GSoC was amazing! My mentor, Ryan is absolutely the G.O.A.T. Couldnāt have completed this smoothly without all his help!
Ryan is the G.O.A.T. (image generated using Dalle, thank you Open AI for giving me access) š
Thank you to everyone at š„LIT, for considering me for this project. Couldnāt have asked for a better way to spend this summer.
See you soon! Bye šš»šš»