February 23rd, 2026
Once again, we had only one new data point: the BAFTAs. One Battle After Another took home the best film there, bringing its chances of winning Best Picture to 77%…
Read MoreFebruary 23rd, 2026
Once again, we had only one new data point: the BAFTAs. One Battle After Another took home the best film there, bringing its chances of winning Best Picture to 77%…
Read MoreI wrote a post last year looking at how to employ tools in LangChain to have GPT-3.5 Turbo access information on the web, outside of its training data.
The purpose of the present post is to revisit this post, improving the poor performance I saw there through refactoring and prompt engineering.
The motivating example is again using large language models (LLMs) to help me calculate features for my Oscar model. Specifically: How many films did the director of a Best Picture nominee direct before the nominated film?
Read MoreI wrote a post in July 2023 describing my process for building a supervised text classification pipeline. In short, the process first involves reading the text, writing a thematic content coding guide, and having humans label text. Then, I define a variety of ways to pre-process text (e.g., word vs. word-and-bigram tokenizing, stemming vs. not, stop words vs. not, filtering on the number of times a word had to appear in the corpus) in a workflowset. Then, I run these different pre-processors through different standard models: elastic net, XGBoost, random forest, etc. Each class of text has its own model, so I would run this pipeline five times if there were five topics in the text. Importantly, this is not natural language processing (NLP), as it was a bag-of-words approach.
The idea was to leverage the domain knowledge of the experts on my team through content coding, and then scaling it up using a machine learning pipeline. In the post, I bemoaned how most of the “NLP” or “AI-driven” tools I had tested did not do very well. The tools I was thinking of were all web-based, point-and-click applications that I had tried out since about 2018, and they usually were unsupervised.
We are in a wildly different environment now when it comes to analyzing text than we were even a few years ago. I am revisiting that post to explore alternate routes to classifying text. I will use the same data as I did in that post: 720 Letterboxd reviews of Wes Anderson’s film Asteroid City. There is only one code: Did the review discuss Wes Anderson’s unique visual style (1) or not (0)? I hand-labeled all of these on one afternoon to give me a supervised dataset to play with.
Read More