How I Use AI Products

02 Jan, 2025

Created by Claude: "Animated neural network connections with pulsing data flows. Glowing nodes representing different AI capabilities. Floating particles to add movement. Iconic representations of different AI functions (code, brain, voice, vision). Animated connecting lines with flowing patterns."

Disclaimer: This article reflects my personal experience with AI tools as of January 2025. Given the rapid pace of AI development, some information may become outdated quickly. Tools and features mentioned here may change, improve, or become obsolete.

We are experiencing a period of accelerated AI evolution. AI capabilities are growing much faster than people can absorb and leverage them. As a result, there is a widening gap between available capabilities and what people are aware of and able to use. This applies at all levels—from individual users to entire departments and companies—and across all roles, from knowledge workers to business managers.

Claude suggests calling it the AI Horizon Gap (like the horizon that keeps moving as you approach it).

There is significant value in bridging this gap. By doing so, you can effectively project yourself several years into the future.

In 2025, I would like to help more people bridge the AI Horizon Gap. While I'm not sure how to accomplish this, I'm starting with the small step of sharing how I use AI products and what I've learned in the process.

Table Summary

Claude has created a table that summarizes this entire article.

Screenshot 2025-01-03 at 15

General Queries

For general inquiries, learning about new topics, or using image understanding capabilities, I rely on two main products:

ChatGPT (with the standard model GPT-4o) has become my go-to product for most queries due to its speed and lenient rate limits. The native apps for smartphones and Macs are excellent, featuring voice input (powered by Whisper) that converts speech into text prompts. I particularly appreciate the ability to play messages aloud. The recently added Search provides quick access to fresh web data, while the Memory function is currently only moderately useful. Together, these features make ChatGPT my primary choice for general queries.

Claude (with the standard model Sonnet 3.5) offers superior intelligence, self-awareness, and personality compared to GPT-4o. It also features a larger context window, allowing it to process more data when responding. While I generally prefer conversing with Claude, it faces some usability challenges: the rate limit is quite restrictive, and frequent service interruptions disrupt my workflow. The mobile and Mac apps are sluggish and less polished. Voice input, an essential feature for on-the-go use, was recently added to the mobile app, but its quality is currently subpar and it often misinterprets messages.

Due to these limitations, I reserve Claude for situations requiring particularly insightful answers, personal advice, or in-depth topic exploration.

Web Search

Web search is inherently an agentic problem, requiring iterative exploration of web space to achieve the user's objective. While we've seen significant progress in this area, it remains far from fully solved today.

Perplexity Pro remains the leading product in this space. I picked Claude 3.5 Sonnet as the underlying model. Perplexity formulates comprehensive search plans that typically yield accurate and helpful results. While it hasn't yet achieved fully agentic search capabilities—such as spending an hour or two compiling lists of relevant websites and extracting information—it consistently proves useful for most search needs.

ChatGPT Search poses the most significant challenge to Perplexity. While not quite matching Perplexity's capabilities in my experience, it offers impressive speed and general utility. Its main advantage lies in its integration with the broader ChatGPT ecosystem, allowing search to fit seamlessly into larger workflows. The Advanced Voice Mode feature enables live discussion of web results.

Google Deep Research shows promise in delivering truly agentic web search. It creates customizable research plans and dedicates significant time to thorough website analysis—ideal for when you want accuracy and don’t care about speed. While the current product hasn't fully met my expectations, I'm confident that it will improve quickly.

Personal And Career Advice

For deeper life questions—from handling delicate personal situations and understanding emotional matters to career direction and financial optimization—there’s only one AI I trust.

Claude Sonnet With Projects is my only trusted advisor for personal matters (sorry ChatGPT!). Projects in Claude enables organized storage of prompts, conversations, and data. I've created a dedicated Personal project containing relevant documents about myself, including my resume*, which updates automatically through Google Docs integration. While Claude cannot access conversations from previous sessions, even within the same project, I've developed a workaround for implementing a 'memory' feature: at the conclusion of each conversation, I request that Claude summarize our discussion and its learnings about me in an artifact, which I then add to the project knowledge. This creates a gradually expanding understanding of my background and circumstances.

(*Specifically, I've attached my resume as a Google Doc connection, ensuring Claude always accesses the most current version.)

My Personal Claude project where I give it context about my life situation and ask for advice.

Data Analysis

This is about loading, cleaning, transforming, visualizing, and deriving insights from data.

ChatGPT's Data Analysis tool continues to lead. Its intuitive interface allows me to view tables and manually select columns, rows, or cells. It can analyze multiple files simultaneously, making comparisons and combinations. While it effectively handles many analytical tasks, it may not suffice for complex analyses requiring sophisticated code and multiple file integrations.

Google Colab with Gemini is my preferred solution for complex data analysis. It combines the flexibility of Jupyter-style notebooks with Gemini's capabilities as a coding copilot. It’s not perfect, but I’ve noticed a lot of improvements lately.

Claude's new data analysis tool, which runs JavaScript in the background, offers impressive visualizations and leverages Claude's intelligence. However, I haven't yet found a compelling need to incorporate it into my workflow.

ChatGPT's Data Analysis tool. You can see it has identified the 4 cells I selected.

Reading PDFs

Vast amounts of human knowledge are locked in PDFs. It’s no wonder that “chat with your PDF” products have sprung up like mushrooms in the AI era.

Claude excels in this domain. Its PDF feature stands out by actually "looking" at PDFs, utilizing Claude's visual understanding capabilities. This represents a significant advantage over alternatives, including ChatGPT, which typically rely on programmatic text extraction. This distinction is particularly crucial for interpreting visual elements such as charts and tables. While occasional hallucinations still necessitate verification, I've been consistently impressed by Claude's comprehensive understanding of PDF content.

Voice Transcription

Voice transcription, while often overlooked in AI discussions, has saved me an incalculable amount of time.

OpenAI's Whisper effectively addresses 99% of my voice transcription needs. To maximize its utility, I've implemented two simple solutions:

First, I created a Telegram bot that converts voice messages into Whisper transcriptions. This allows me to capture ideas on the go and efficiently process voice messages from others when listening isn't convenient.

Second, I developed whisper-keyboard, a straightforward package enabling Whisper-powered writing on my laptop.

My Telegram bot transcribing my voice notes.
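
For readers who want to try something similar, here is a minimal sketch of the transcription step only. It is not the code behind my bot or whisper-keyboard: the function names are made up for illustration, and it assumes the official openai Python package, an OPENAI_API_KEY environment variable, and the hosted whisper-1 model. The Telegram side (receiving the voice message and downloading the audio file) is left out.

```python
from pathlib import Path
from typing import Any


def transcribe_voice_note(audio_path: Path, client: Any, model: str = "whisper-1") -> str:
    """Send one audio file to a Whisper transcription endpoint and return the text.

    `client` is expected to look like an `openai.OpenAI()` instance, i.e. to
    expose `client.audio.transcriptions.create(model=..., file=...)`.
    """
    with audio_path.open("rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    # The response object carries the transcript in its `text` attribute.
    return result.text.strip()


def main(audio_file: str) -> None:
    """Hypothetical entry point: transcribe one local file and print the result."""
    # Imported lazily so the helper above can be reused (and tested) without the SDK.
    from openai import OpenAI  # assumes `pip install openai`

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    print(transcribe_voice_note(Path(audio_file), client))
```

Calling main("note.mp3") prints the transcript of a local recording. Passing the client in as a parameter keeps the helper easy to plug into a bot handler or a keyboard shortcut.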

Coding

Claude Sonnet 3.5 remains unmatched in coding capabilities. Its training data, which extends through April 2024, covers recent libraries, making it my primary resource for specific coding questions.

Cursor has increasingly become my preferred code editor. Built from the ground up with AI integration in mind, it delivers impressive results, particularly when powered by Sonnet 3.5. My typical workflow begins with discussing and planning projects with Claude in their native app, then transitioning to Cursor for the actual coding process, where I spend the majority of my development time.

For quick coding queries, I also use ChatGPT. Its recent ability to directly interface with Cursor and other macOS applications has significantly reduced friction. This feature proves especially valuable for debugging terminal issues, which often present the most challenging problems.

ChatGPT 'Work with Apps' debugging my terminal.

While several Cursor competitors are emerging, I haven't yet explored their offerings.

A new category of tools aims to generate complete websites and web applications from prompts. I've briefly experimented with Vercel v0 and Lovable, appreciating their potential to automate frontend web development, a challenging area for me. These platforms promise one-click deployment solutions, and early results have been impressive.

However, I've encountered difficulties during actual deployment, as unexpected bugs and issues emerge that the models can't automatically resolve. Manual debugging often becomes necessary, which proves time-consuming given my limited frontend expertise.

Despite current limitations, this field shows tremendous promise, and I anticipate significant improvements throughout 2025.

Live Assistance

Live assistance technology is in its early stages, with significant developments expected in 2025.

Currently, ChatGPT's Advanced Voice Mode stands as the primary available product. Google is also making strides in this area, offering an impressive and free experimental preview on AI Studio.

ChatGPT Advanced Voice offers several capabilities: real-time conversation, Memory feature integration, web searching, and both phone screen and camera vision functionality. When used with the macOS app, it can interface with specific applications in real-time through Work with Apps, including Apple Notes, Cursor, and Notion.

While the technology is impressive, its current utility remains somewhat limited. The voice interface feels robotic and tends to interrupt conversations after brief pauses. Users cannot simultaneously use voice and text, and daily usage is restricted to one hour (less when using vision features). The vision functionality operates at relatively low resolution and captures intermittent screenshots rather than a continuous feed, which limits its ability to interpret dynamic events. Additionally, it cannot perform actions on behalf of users.

I primarily use Voice Mode for rapid-fire questions. One application I’ve discovered for live voice + vision is cataloging medicine cabinets, with real-time explanations of each product's purpose.

Video Understanding

Currently, video analysis capabilities are primarily available through a single, free service:

Gemini models in Google's AI Studio offer remarkable video analysis capabilities. While not perfect, their functionality is impressive. For instance, when I needed to extract data from a website lacking export features, I recorded a screen capture of my website navigation. By uploading this video to Gemini and requesting JSON-formatted data extraction, I received surprisingly accurate results with minimal errors, though verification was still necessary.
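
The verification step can be partly automated. The sketch below is an illustration, not what I actually ran: it assumes the model was asked to return a JSON array of records, and the field names ("name" and "price") are hypothetical placeholders for whatever the website shows. It only flags records with missing fields; checking that the values are correct still needs a human.

```python
import json
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical field names; replace with the columns you asked the model to extract.
REQUIRED_FIELDS = ("name", "price")


def check_extracted_records(
    raw_reply: str, required_fields: Sequence[str] = REQUIRED_FIELDS
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Parse a JSON array returned by the model and flag incomplete records.

    Returns (valid_records, problems), where problems is a list of
    human-readable messages to review by hand.
    """
    records = json.loads(raw_reply)  # raises json.JSONDecodeError on malformed output
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")

    valid: List[Dict[str, Any]] = []
    problems: List[str] = []
    for index, record in enumerate(records):
        # A field counts as missing if the record is not an object,
        # or the value is absent, null, or an empty string.
        missing = [
            field
            for field in required_fields
            if not isinstance(record, dict) or record.get(field) in (None, "")
        ]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
        else:
            valid.append(record)
    return valid, problems
```

Anything listed in problems is a good place to start when comparing the output against the original recording.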

Learning Stuff

NotebookLM by Google stands out as an exceptional learning tool, leveraging Gemini's intelligence and extensive context window. The platform allows users to create notebooks and upload multiple sources for summarization and inquiry. Gemini responds to questions by directly referencing your sources, ensuring grounded responses. One particularly fascinating feature is the AI podcast generator, which creates remarkably realistic "deep dive" episodes featuring two hosts discussing your sources. I experienced this firsthand when I uploaded my master's thesis—for once, I had the illusion that people were actually reading my work! 😀 Most impressively, many of these features are available at no cost.

Of course, both Claude and ChatGPT remain valuable learning companions.

The NotebookLM notebook where I loaded the contents of my SQL course.

Voice Generation

ElevenLabs has established itself as the industry leader in voice generation. Their reader app for mobile devices has become particularly useful for converting articles into audio content for on-the-go consumption.

Music Generation

The music generation space has provided entertaining experiences through platforms like Suno and Udio. I’ve had fun using both of these and don’t have a particular preference. If you’re into music, let me know which you like best.

Creative Writing

The October 2024 release of Claude 3.5 Sonnet represented a significant leap forward in creative writing capabilities. For the first time, I've been able to generate prose that meets my standards, successfully transforming several of my ideas into stories I genuinely appreciate. I'm currently exploring how to publish these stories.

My creative process involves using a Claude Project dedicated to short story writing, including style guidelines and writing principles (with particular emphasis on 'show don't tell'). When I get a story idea, I talk with Claude to explore potential plot points and developments. The actual writing process involves crafting the story in manageable segments, continuously providing feedback and refinement until we reach a satisfactory result.

I've found Claude to be an attentive listener, empathetic author, and genuinely creative writer. These stories wouldn't exist without its assistance, and I believe it deserves equal authorial credit!

Reasoning

OpenAI's o1 has introduced reasoning models capable of extended deliberation before responding. Currently, ChatGPT Plus subscribers can access both o1 and o1-mini.

While I'm still discovering their optimal applications, I typically employ these models when analyzing large volumes of information in detail. They are good at strategic thinking, particularly in simulating various scenarios and determining optimal responses. Though I occasionally consult them for coding challenges, they don't consistently outperform Claude Sonnet in this area.

A standout feature of OpenAI's 'o' series is their capacity for extended output, which allows them to generate comprehensive essays in a single pass, a task that would require multiple iterations with other models. For instance, one of these models generated a 17-chapter essay on the impact of AI coding on the job market.

Two other notable reasoning models, Gemini Flash Thinking and DeepSeek R1, remain on my exploration list.

Medical

For medical inquiries, which fortunately arise infrequently, I primarily rely on OpenAI's 'o1' model, especially since it gained image understanding capabilities. Both benchmarks and anecdotal evidence suggest it leads the field in medical knowledge. I consult Claude Sonnet as a secondary opinion.

I've had positive experiences analyzing blood test results and deriving actionable insights, though I encourage readers to form their own judgments as I'm not a medical expert.

Large Summarization And Extraction

When dealing with extensive text analysis, two main approaches are available:

Upload the entire text to Gemini (AI Studio), which currently supports a 2 million token context window.
Develop code to segment the text and process chunks in parallel using efficient, cost-effective AI models. GPT-4o mini, Gemini 2.0 Flash, and DeepSeek-V3 would be my current recommendations.

While I've successfully implemented the chunking approach several times, I have limited experience with Gemini's expanded context window capabilities.
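
The chunking approach can be sketched roughly as follows. This is a simplified illustration rather than my production code: the chunk size, the overlap, and the function names are assumptions, and fake_summarize is a stand-in for a real call to whichever model API you pick. Threads are a reasonable fit here because the work is dominated by waiting on network responses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 4000, overlap: int = 200) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    step = max_chars - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def process_in_parallel(
    chunks: List[str], worker: Callable[[str], str], max_workers: int = 8
) -> List[str]:
    """Run `worker` on every chunk concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, chunks))


def fake_summarize(chunk: str) -> str:
    """Placeholder for a real model call (hypothetical): returns a short prefix."""
    return chunk[:40]


if __name__ == "__main__":
    document = "lorem ipsum " * 2000
    pieces = split_into_chunks(document)
    summaries = process_in_parallel(pieces, fake_summarize)
    print(f"{len(pieces)} chunks processed, {len(summaries)} partial results")
```

The partial results can then be merged, or passed through the model once more to produce a final summary. The overlap reduces the chance that a sentence split across a boundary is lost to both chunks.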

Stuff I Don’t Do

Several AI applications remain mostly unexplored in my current workflow:

Image and video generation
Roleplaying platforms like Character.ai
AI-assisted scientific research (an area I'm eager to explore)
Likely others that don't immediately come to mind

If you have any opinions on these, please share!

Conclusion

I feel immense gratitude for the array of powerful tools at our disposal. However, I'm acutely aware of the gap between available capabilities and widespread user awareness and proficiency. I hope to contribute to bridging this divide, helping others unlock the tremendous value these tools offer.