LazyTube — A Personal Tale of My Lazy Video-Watching Habits
AKA: are you lazy like me? Large Language Models (LLMs) can help you be even more lazy.
The Spark
There’s a certain level of laziness that comes with wanting to get the gist of a 30-minute YouTube video in under two minutes. That’s me — sometimes. I’m the type who, after a long day, just wants to lie back and get the highlights without committing to the whole viewing experience. This is where LazyTube comes into play.
It all started when I found myself repeatedly skimming through YouTube videos. I’d click on a tutorial or a documentary, and halfway through, I’d lose interest. Not because the content wasn’t good, but because I simply didn’t have the patience to sit through the entire thing. I realized I needed a way to extract the key points quickly, without missing out on the important stuff.
After all, wasn’t Bill Gates the one who reportedly said that he would “[…] choose a lazy person to do a hard job [because] a lazy person will find an easy way to do it”?
Enter LazyTube
And so, on a rainy weekend, I set out to create LazyTube, a project aimed at solving my very specific problem. LazyTube is designed to extract transcripts from YouTube videos and summarize them using a Large Language Model (LLM). The idea is simple: get the essence of the video without the fluff.
Here’s a quick rundown of how LazyTube works:
- Extract the Transcript: Using the youtube-transcript-api, LazyTube pulls the video transcript.
- Summarize with LLM: The transcript is then passed to an LLM for summarization.
- Interactive Query: You can interact with the summary to get more details or clarify specific points.
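The three steps above can be sketched end to end in a few lines. This is just an illustration of the flow, not the actual LazyTube code: the helper functions below are placeholders standing in for the real transcript fetcher, the LLM call, and the follow-up query.

```python
# Sketch of the LazyTube flow. Each helper is a placeholder for the real
# piece (youtube-transcript-api, the Bedrock LLM call, a follow-up Q&A).

def extract_transcript(url: str) -> str:
    # Placeholder: the real version would call youtube-transcript-api.
    return f"transcript of {url}"

def summarize(transcript: str) -> str:
    # Placeholder: the real version would send the transcript to an LLM.
    return f"summary of: {transcript}"

def query(summary: str, question: str) -> str:
    # Placeholder: the real version would ask the LLM a follow-up question.
    return f"answer to {question!r} based on: {summary}"

url = "https://www.youtube.com/watch?v=example"
summary = summarize(extract_transcript(url))
print(query(summary, "What are the key points?"))
```

Everything that follows is about filling in those three placeholders with real components.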
Getting Started with LazyTube
The setup process is straightforward. First, I made sure to install the necessary packages:
%pip install --upgrade --quiet langchain langchain-core langchain-community pytube youtube-transcript-api boto3 botocore
Next, I configured the LLM with the BedrockChat class from langchain_community and set up my model parameters. Here’s a snippet of the configuration:
import boto3
from langchain_community.chat_models import BedrockChat
# Bedrock Runtime
bedrock_runtime = boto3.client(service_name="bedrock-runtime", region_name="us-west-2")
# Model configuration
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
model_kwargs = {
    "max_tokens": 2048,
    "temperature": 0.1,
    "top_k": 250,
    "top_p": 1,
    "stop_sequences": ["\n\nHuman"],
}
# LangChain class for chat
model = BedrockChat(
    client=bedrock_runtime,
    model_id=model_id,
    model_kwargs=model_kwargs,
)
The Magic in Action
After setting up the model, it’s time to put LazyTube to work. The first step is always to figure out the right prompt to give to the LLM so that it can do your bidding. Here is what I ended up with:
from langchain.prompts import PromptTemplate
from textwrap import dedent
template = dedent("""\
You are an honest and helpful bot. \
Your goal is to help create content which is useful and informative. \
You are provided with the transcript of a YouTube video and the video info. \
Here is the video transcript:
<transcript>\n{document}\n</transcript>
Summarize the video transcript and highlight the key learnings from the video.
These bullet points need to cover the main topics of the whole video, \
have to be useful and informative, and must be relevant to why a human \
should care about watching the whole video in the first place. \
Use these highlights to write a short blog, \
with one paragraph per key bullet point, \
and one introduction paragraph which introduces the topic, \
refers to the video title and speakers. \
Write the blog in a friendly language, neither formal nor too informal, \
in first person. Make sure the blog has titles for the paragraphs \
and that the sections are well-defined and separated. \
The output should be in Markdown format. \
Include a link to the video in the first paragraph.\
""")
prompt = PromptTemplate(template=template, input_variables=["document"])
By simply inputting a YouTube URL, LazyTube extracts and processes the video transcript, thanks to the YoutubeLoader loader from the langchain_community package.
import langchain
from langchain_core.output_parsers import StrOutputParser
from langchain_community.document_loaders import YoutubeLoader
from IPython.display import Markdown

langchain.debug = False

# Ask the user to provide the YouTube URL as input
url = input("YouTube URL: ")

# Parse and invoke the chain
video = YoutubeLoader.from_youtube_url(url, add_video_info=True)
chain = prompt | model | StrOutputParser()
answer = chain.invoke({"document": video.load()})
Markdown(answer.replace("\n\n", "\n"))
This simple interaction allows me to get a concise summary and interact further if needed. No more scrubbing through videos or sitting through long intros and outros.
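For the follow-up part, one simple approach is to feed the generated summary back to the model as context for a second question. The template and helper below are hypothetical names I’m using to show the shape of that step; a plain format string stands in for the LangChain chain.

```python
# Hypothetical follow-up step: reuse the generated summary as context for
# a second question. In LazyTube this text would go through the same
# model; here we just build the prompt so the structure is visible.

FOLLOWUP_TEMPLATE = (
    "Here is a summary of a YouTube video:\n"
    "<summary>\n{summary}\n</summary>\n"
    "Answer this question about the video: {question}"
)

def build_followup_prompt(summary: str, question: str) -> str:
    return FOLLOWUP_TEMPLATE.format(summary=summary, question=question)

prompt_text = build_followup_prompt(
    summary="The talk covers three ways to cache LLM calls.",
    question="Which caching method is cheapest?",
)
print(prompt_text)
```

Wrapping that prompt in another `prompt | model | StrOutputParser()` chain gives the interactive query step from the rundown above.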
Cost Efficiency
One of the nice aspects of LazyTube is its cost efficiency. Because it uses Claude 3 Haiku on Amazon Bedrock behind the scenes, together with LangChain, here’s an easy way to get the cost of an invocation using LangChain callbacks:
from langchain_community.callbacks.manager import get_bedrock_anthropic_callback
with get_bedrock_anthropic_callback() as cb:
    answer = chain.invoke({"document": video.load()})
    print(cb)
Here are a couple of expected costs:
- For a 10-minute video, the cost is about $0.0014
- For a 1-hour video, the cost is about $0.005
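These figures are easy to sanity-check by hand. The sketch below assumes Claude 3 Haiku’s on-demand Bedrock pricing at the time of writing ($0.25 per million input tokens, $1.25 per million output tokens — check the current price list before relying on it), and a rough guess at token counts:

```python
# Back-of-the-envelope cost estimate for a single summarization call.
# Assumed on-demand Bedrock pricing for Claude 3 Haiku (verify against
# the current AWS price list):
INPUT_PRICE_PER_M = 0.25   # USD per million input tokens
OUTPUT_PRICE_PER_M = 1.25  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    # Total USD cost for one invocation.
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A ~10-minute transcript is very roughly 3k tokens; with a ~500-token
# summary, the invocation lands in the tenth-of-a-cent range.
print(f"${estimate_cost(3000, 500):.4f}")  # → $0.0014
```

That lines up with what the callback reports in practice: even long videos cost well under a cent to summarize.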
Embracing the Lazy Life
LazyTube has transformed how I consume content. I can now get the highlights and delve deeper only when something truly piques my interest. It’s not about being lazy; it’s about being efficient with my time. And honestly, who doesn’t want that? Even better, I can have LazyTube take a video and write a blog or a LinkedIn post about it!
So, if you’re like me and sometimes just want the gist without the grind, give LazyTube a try. It might just change the way you watch YouTube.
Feel free to drop your thoughts or experiences in the comments. How do you manage your video-watching habits? Let’s chat!