Experimenting with Natural Language Query Using HuggingFace Transformers
Dated : Feb-2022
Over recent years, it has always amazed me how software like ThoughtSpot and Amazon QuickSight understands a query like ‘Who ran the most expensive query?’, responds with the matching records, and intelligently summarizes the content. Sure, Alexa, Siri, Google and other personal assistants, as well as chatbots, have existed for some time. But I never got very excited until I was surprised by ThoughtSpot.
For some time, I had been working through the concepts and process behind Natural Language Processing. After many random walks through NLTK, spaCy, TensorFlow and numerous others, I finally landed on HuggingFace. After going through the “HuggingFace course”, I learnt how easy it is to get started in NLP as a beginner and build more knowledge from there.
Getting started has become much easier than it was in 2017, to the point that a developer who is new to NLP can pick it up easily.
What you see on the left is accomplished with as few as 5 lines of code. Here is the code, hosted in Google Colab.
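To give a feel for how little code this takes, here is a minimal sketch using the `table-question-answering` pipeline from HuggingFace Transformers. It is not the exact Colab notebook: the sample table, the `ask` helper, and the choice of the `google/tapas-base-finetuned-wtq` model are my assumptions for illustration, and the model is downloaded the first time it runs.

```python
# Hypothetical sample data: the users and credit numbers are made up.
# TAPAS-style models expect every cell to be a string.
QUERY_HISTORY = {
    "user": ["alice", "bob", "carol"],
    "query_id": ["q1", "q2", "q3"],
    "credits_used": ["1.2", "8.5", "0.4"],
}


def ask(question, table=QUERY_HISTORY):
    """Answer a natural language question about a small table."""
    # Imported lazily so the table can be inspected without the library installed.
    from transformers import pipeline

    # Assumed model choice; any table question answering checkpoint could be swapped in.
    tqa = pipeline(
        "table-question-answering",
        model="google/tapas-base-finetuned-wtq",
    )
    return tqa(table=table, query=question)


# Example call (requires transformers, pandas and a PyTorch backend):
# print(ask("Who ran the most expensive query?")["answer"])
```

The pipeline returns a dictionary whose `answer` key holds the text answer. How good that answer is depends heavily on the model and on how the table is laid out, so treat it as a starting point rather than a robust solution.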
With the latest advancements in NLP, specifically around GPT-3, it is just going to blow up. As I see it, NLP has been exploding for a couple of years, and now a big wave of new capabilities is going to be built on top of it.
What resources got me here?
My learning curve has been very much like gradient descent and backpropagation, just to get to an exciting start. It's an ML pun :-)
Unlike most experienced people, I did not start with courses, universities, or books. A bit like gradient descent starting from its initial weights, I began the journey with YouTube videos and blogs for inspiration and understanding.
If your resolution for 2022 is to learn something new and challenging, here are some links to get started in NLP.
- Code Basics YouTube videos. I have deep respect for Mr. Dhaval Patel for his effort to get us started.
- TensorFlow YouTube videos.
- HuggingFace course
There are many wonderful guys/gals to get you started on your journey into DL, NLP, etc. Everyone has a different style and approach, and I highly appreciate each of their efforts. I listed the ones above only because their presentations resonated with my learning speed/frequency. Your mileage may vary.
Also, being a passive, slow learner, it took me 1–2 months to gather and understand some initial concepts. This is just the tip of the iceberg, though, and there is a lot more to learn.
Here are some ideas I had that may get you going. Remember, these can be done easily with SQL, but focus on trying to achieve them using NLQ (Natural Language Query).
- Using Snowflake query history, determine who is running the costly queries and how often. I actually did this in my Snowflake account.
- Using ThinkNum’s Job Data from Snowflake Data Marketplace, find the companies that are looking for NLP developers :-)
- Using GPT-3 transformers, summarize emails in your inbox or a presentation
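For the first idea, a small preparation step helps before any natural language querying: table question answering models such as TAPAS expect the table as columns of strings. The sketch below shows one way to reshape rows into that form, plus a plain-Python count of runs per user to sanity-check whatever the model answers. The function names are hypothetical, and the column names `USER_NAME` and `TOTAL_ELAPSED_TIME` are assumptions about how the query history was exported, so adjust them to your own data.

```python
from collections import Counter


def rows_to_table(rows):
    """Convert a list of row dicts into {column: [cells as strings]}."""
    if not rows:
        return {}
    # Use the first row to fix the column order; missing values become "".
    columns = list(rows[0].keys())
    return {col: [str(row.get(col, "")) for row in rows] for col in columns}


def runs_per_user(rows, user_column="USER_NAME"):
    """Count how often each user appears, as a baseline to compare against."""
    return Counter(row[user_column] for row in rows)


# Hypothetical sample rows; real rows would come from your own export.
sample_rows = [
    {"USER_NAME": "alice", "TOTAL_ELAPSED_TIME": 1200},
    {"USER_NAME": "bob", "TOTAL_ELAPSED_TIME": 90000},
    {"USER_NAME": "bob", "TOTAL_ELAPSED_TIME": 100},
]

table = rows_to_table(sample_rows)
counts = runs_per_user(sample_rows)
```

The resulting `table` can be handed to a table question answering model, while `counts` gives a quick way to verify a "how often" answer.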
Get inspired by:
- Amazon QuickSight
- Assembly AI
What I achieved was just a start. It is by no means a robust model, but a means to get started. Along the way, I may never become a “Data Scientist”, but becoming proficient in DL (TensorFlow, PyTorch) is definitely possible. If I can get there, so can you. I hope that in 2022 you add NLP to your skill set and share your journey.
Feed-Forward >> Back propagate >> Repeat !!!