AI Nerding
3/5/2024, 9:10:25 PM

Messages and calls are end-to-end encrypted. No one outside of this chat, not even WhatsApp, can read or listen to them.

AI Nerding
3/5/2024, 9:10:25 PM

You created group “AI Nerding”

Chris Boden
3/5/2024, 9:10:37 PM
Luc Pettett
3/5/2024, 9:19:22 PM

Add pip!

Luc Pettett
3/5/2024, 9:20:19 PM

Can you make me admin @61421850424 ? Will add Andrew from my team

~ Andrew Bucknall
3/5/2024, 9:21:30 PM

Luc Pettett added ~ Andrew Bucknall

Luc Pettett
3/5/2024, 9:21:42 PM

@61488427815 welcome!

Luc Pettett
3/5/2024, 9:22:09 PM

Chris’s welcome post:

Luc Pettett
3/5/2024, 9:22:11 PM
Chris Boden
3/5/2024, 9:24:43 PM

Hi Andrew

Chris Boden
4/5/2024, 8:35:15 AM

Post from Mike Knoop, Zapier founder, on their AI dev best practices
"Building AI-first products with LLMs

We have now experienced the full dev lifecycle of many AI-first products
@zapier
(AI Actions, AI bots, Zap guesser, Chatbot builder, more). Here is the advise I give to our AI eng teams."

Chris Boden
4/5/2024, 8:40:17 AM
Image
Pip Bingemann
4/5/2024, 11:16:02 AM

You added Pip Bingemann

Chris Boden
4/5/2024, 11:16:23 AM

Welcome Pip

Chris Boden
4/5/2024, 2:07:48 PM

Anything of interest on the AI front at Sunrise?

Pip Bingemann
4/5/2024, 2:09:43 PM

Nah. Content was pretty average for most part. Good event but wouldn’t go if primary reason is to learn. Better for networking etc.

Chris Boden
4/5/2024, 2:10:21 PM

so, are we reasonably up the game here in the provinces?

Pip Bingemann
4/5/2024, 2:10:51 PM

100% hub exchange is better 😁

Luc Pettett
4/5/2024, 3:44:47 PM
Image
Chris Boden
5/5/2024, 9:27:02 AM

Braintrust seems to be the topic of the day in my feed. Auto eval saas for gen ai.

https://www.braintrust.dev/

Kieran Browne
5/5/2024, 4:07:07 PM

You added Kieran Browne

Luc Pettett
6/5/2024, 8:49:35 AM

great timing!! testing it this morning to overcome our classification issues

Chris Boden
7/5/2024, 11:16:00 AM

Had a play with crewai - one of the agent frameworks. Quite interesting to read through its internal monologue as it works through the tasks it has. Thought I'd share here in case anyone wants to take a look.

The problem was: "I am a founder and my startup has created an AI co-pilot that takes the role of sales & bd co-founder. Please develop a go-to market plan for our platform: Superfounder."

https://docs.google.com/document/d/1Afn6kxPJ-JGZNToMzpw0esQIKFn26-WSmzae8T7GDQw/edit?usp=sharing

Chris Boden
7/5/2024, 11:25:14 AM

Am using llama3 80b for this, via openrouter
36 api calls
Approx 50-60k input tokens
<5k output tokens
total cost: $0.07

Chris Boden
7/5/2024, 11:34:36 AM

[spoiler alert: the agile coach pulled things off the rails]

Chris Boden
7/5/2024, 11:39:22 AM
Image
Chris Boden
7/5/2024, 11:50:00 AM

Quite a few useful outputs along the way even though the final output not great

Chris Boden
8/5/2024, 3:30:40 PM

lol "“this is the biggest day for iPad since its introduction” in 2010"

Luc Pettett
8/5/2024, 3:31:16 PM

urk

Luc Pettett
8/5/2024, 3:31:20 PM

any AI reveal?

Chris Boden
8/5/2024, 3:32:25 PM

not from what i can see

“With iPad Pro with M4, users can perform AI-enabled tasks even faster, like easily isolate a subject from its background in 4K video with just a tap with Scene Removal Mask in Final Cut Pro. With this advanced level of performance, the neural engine in M4 is more powerful than any neural processing unit in any AI PC today.”

It comes as rival Microsoft – which overtook Apple to become the world’s most valuable company earlier this year – is set to unveil many AI-enabled laptop computers, including models powered by Qualcomm’s Snapdragon processor, at its Build conference in Seattle later this month.

Chris Boden
8/5/2024, 3:34:14 PM

who needs an overpriced AI wearable when you can get an overpriced AI paperweight

~ Andrew Bucknall
8/5/2024, 3:35:30 PM

The unified memory architecture seems to be miles ahead of typical computers for doing local inference.

~ Andrew Bucknall
8/5/2024, 3:36:53 PM

With MS being cozy with OpenAI surely they’ll want to maximise cloud usage (and sell your data?)

Luc Pettett
8/5/2024, 3:37:57 PM

opensource will fit with apples mantra.

Luc Pettett
8/5/2024, 3:38:28 PM
Image
Chris Boden
8/5/2024, 3:38:55 PM

i think the one caveat in the llama licensing is for bigcos

Chris Boden
8/5/2024, 4:26:07 PM
Chris Boden
8/5/2024, 4:37:40 PM
Image
Luc Pettett
9/5/2024, 9:31:20 AM

interesting an frustrating LLM problem

Luc Pettett
9/5/2024, 9:31:35 AM
Image
Luc Pettett
9/5/2024, 9:32:46 AM
Image
Pip Bingemann
9/5/2024, 10:59:55 AM

- THIS is the coolest thing ive seen in a while.

Pip Bingemann
9/5/2024, 11:00:10 AM

so many cool applications here.

Pip Bingemann
9/5/2024, 11:01:01 AM

and this is a little wild. https://websim.ai/ - a fully simulated internet. All powered by claude.

Luc Pettett
9/5/2024, 11:07:22 AM

woh websim super interesting

Pip Bingemann
9/5/2024, 11:08:45 AM

yeah websim is wild and fun. that google notebook thing i think is game changing and not for notetaking purposes.

Scott Hannan
9/5/2024, 1:43:18 PM

You added Scott Hannan

Chris Boden
9/5/2024, 1:27:17 PM
Image
Chris Boden
9/5/2024, 1:43:42 PM

Friends, I'm just adding our Silicon Valley correspondent to the group, Scottie Hannan

Chris Boden
9/5/2024, 1:44:55 PM

Scott is a long time friend of mine who went to the Valley and did good while the rest of us came home with our tails between our legs. https://www.linkedin.com/in/scotthannan/

Luc Pettett
9/5/2024, 2:08:17 PM
Image
Luc Pettett
9/5/2024, 2:08:22 PM

need to switch up the words!

Chris Boden
9/5/2024, 2:12:08 PM

Scott, for context:

Luc & Andrew are building telescope.co
Pip & Kieran are building sprinboards.ai
Aaron sold Retail Express to Maropost last year and is building a new AI platform
Guys, Scott led platform partnerships at FB after they acquired his startup, then went on to run partnerships at Pinterest, then tried to save Sam Harris from himself, then helped vidmob with their stuff. Recently been helping AMZN with their gen ai stuff for merchants, etc.

Chris Boden
9/5/2024, 2:13:32 PM

Lives in Palo Alto and very plugged into what's happening in the Valley through his Fb, Pinterest etc networks. Also nerding out on AI. Scott wrote the post I shared this morning

Scott Hannan
10/5/2024, 7:50:14 AM

Thanks Chris and nice to meet you all. Silicon Valley correspondent, I like that ;)

As my first nerd contribution I came across this today. TLDR using an LLM to prompt an LLM, with inputs to the former being trained on pairs of prompts with performance metrics. Iterative model driven prompting. www.arxiv.org/abs/2309.03409

Chris Boden
10/5/2024, 10:02:06 AM
Chris Boden
10/5/2024, 10:02:06 AM

this is the first AI generated track that has connected with me.

Chris Boden
10/5/2024, 10:02:37 AM
Image
Chris Boden
10/5/2024, 10:04:29 AM

they launched an inpainting feature yesterday where you can start to edit tracks - take a part of the wave form, eg bars 12-16 and say piano solo here, or choral background or whatever.

I guess in this instance the creator has fed in the amazing grace lyrics, specified gospel style and then inpainted various bits of accompaniment

Chris Boden
10/5/2024, 10:09:38 AM

"OpenAI plans to announce Google search competitor on Monday, sources say"

Let's head over to our SV correspondent for the latest...Scott?

Scott Hannan
10/5/2024, 10:32:22 AM

Ha ha well it was supposed to be last week but they postponed. It’s a no brainer/gap for them, they might have even delayed so MSFT could position themselves. If it’s good it will hit Perpelxity pretty hard. And P and others are all pursuing enterprise licenses too, where OpenAI has far more install base and resources. Eg why should we license P as well if we already have ChatGPT (now with search)?

Aaron Blackman
10/5/2024, 7:03:10 PM

Welcome Scott 🙌. PS Perplexity will be easy to beat, and cant wait (their hallucination rate is super high when prompting for detailed data /metrics across various domains - too much reliance on their own proprietary model..)

Aaron Blackman
10/5/2024, 7:14:08 PM

gents if you're looking to validate ideas and research early for PMF - cant recommend this platform enough for conducting live qualitative 1:1 interviews with your future customers

https://www.userinterviews.com/

~ Andrew Bucknall
10/5/2024, 7:26:28 PM

There’s a Brissie-based startup that does something similar too https://www.askable.com/

Scott Hannan
11/5/2024, 8:10:58 AM

10am Monday all set. Let’s see what it brings. I’m still once bitten from Kevin Systrom announcing: This. Changes. Everything. (As in, the world over) when IG added video ;)

Chris Boden
11/5/2024, 8:24:18 AM
Image
Chris Boden
11/5/2024, 8:47:59 AM

New feature from Anthropic for helping with prompt engineering.

Chris Boden
11/5/2024, 8:49:25 AM

“Describe what you want to achieve, and Claude will use prompt engineering techniques like chain-of-thought reasoning to create more effective, precise and reliable prompts.”

Aaron Blackman
11/5/2024, 10:03:48 AM

Useful platform here if you're looking to connect unstructured external data sources to your LLM:
https://airbyte.com/solutions/artificial-intelligence

Chris Boden
11/5/2024, 2:27:32 PM
Image
Chris Boden
11/5/2024, 7:09:59 PM

Interesting thread on AI Lawyers.

Jeez: “Today Google employs 828 "product counsel." That's more than only the biggest law firms.”

Chris Boden
12/5/2024, 8:46:22 AM
Image
Chris Boden
12/5/2024, 8:49:09 AM

OAI put out their new model spec a few days ago with examples like this 👇which seem to back up that 👆prediction

Chris Boden
12/5/2024, 8:49:15 AM
Image
~ Andrew Bucknall
12/5/2024, 9:43:12 AM

I was testing the gpt2 bots to see if they had recent knowledge but they don’t. Will be interesting to see what drops! Fwiw I’m a big fan of Kagi, been using it full-time instead of Google for over a year, and they have an LLM-summarised “quick answer” feature that works quite well.

Chris Boden
12/5/2024, 8:09:08 PM
Image
Chris Boden
12/5/2024, 8:10:03 PM
Image
Chris Boden
12/5/2024, 8:16:00 PM

Sorry, that answer above was in response to my question
"what impact did the speculation about openai launching a search product, have on google share price"

Luc Pettett
12/5/2024, 8:31:52 PM

crikey that's good

Luc Pettett
12/5/2024, 8:32:51 PM

I know @61423698470 uses perplexity quite a bit given they provide sources, helps the user build confidence in the response

Luc Pettett
12/5/2024, 8:33:26 PM

that's my expectation of tomorrows OpenAI release, referenced sources which in turn is a reverse engineered search engine

Chris Boden
12/5/2024, 10:27:43 PM
Scott Hannan
13/5/2024, 10:27:50 AM

Husband bot time :)

Luc Pettett
13/5/2024, 4:11:53 PM

ah my wife is froffing for an upgraded model, she's been stuck with LucGPT3.5 for 20 years poor woman. the context window sucks, no implicit memory, inference costs are so high and availability so unreliable.

Luc Pettett
13/5/2024, 4:13:48 PM

luckily the temperature is high it keeps things interesting

Chris Boden
13/5/2024, 5:07:43 PM

constant hallucination, wildly variable outputs, massive training cost

Chris Boden
13/5/2024, 5:08:56 PM

New AI saas product category, one off purchase of software that runs on local o/s models. Example
https://pdfpals.com/buy

Chris Boden
13/5/2024, 5:10:47 PM
Image
Luc Pettett
13/5/2024, 5:14:01 PM

Interesting. It’s like buying a bundle of inferences

Chris Boden
13/5/2024, 5:14:59 PM

now that there is openai-level o/s embedding models that run locally, vector search basically makes its way into desktop apps

Chris Boden
13/5/2024, 5:15:49 PM

can have a very good 'free' rag experience with llama3 7b model and nomic embedding model both running locally via ollama

Chris Boden
13/5/2024, 5:18:08 PM
Image
Chris Boden
13/5/2024, 5:27:48 PM
Image
Scott Hannan
14/5/2024, 3:35:39 AM

Well that’s it, tough to manage expectations theses days. Impressive cross context progress though it was pretty glitchy demo. RE Apple, I guess they abusing the maps and search experiences as a guide. Eg either not ready (maps) or not core (search), while extracting a large fee and probably rights no one else has to underlying data. They still get arms length of the AI does bad things. Still if AI becomes part of the OS (which surely Gemini will) Apple are either mis-stepping or this is a stop gap.

Scott Hannan
14/5/2024, 5:43:47 AM

From Jim Fan:
- Whoever wins Apple first wins big time. I see 3 levels of integration with iOS:
1) Ditch Siri. OpenAI distills a smaller-tier, purely on-device GPT-4o for iOS, with optional paid upgrade to use the cloud.
2) Native features to stream the camera or screen into the model. Chip-level support for neural audio/video codec.
3) Integrate with iOS system-level action API and smart home APIs. No one uses Siri Shortcuts, but it's time to resurrect. This could become the AI agent product with a billion users from the get-go. The FSD for smartphones with a Tesla-scale data flywheel.

Chris Boden
14/5/2024, 7:34:33 AM

Thanks Scott. Has anyone told Tim Cook that Microsoft own half of OpenAI?

Chris Boden
14/5/2024, 7:35:58 AM
Image
Chris Boden
14/5/2024, 7:41:14 AM
Luc Pettett
14/5/2024, 7:49:43 AM

I want to be able to message A.I. agents with friends in my messages. Give the AIs some moral rights and bring them into the ecosystem

Chris Boden
14/5/2024, 8:14:35 AM

Gpt 4-o model pricing
Input: $5 | Output: $15
per 1M tokens

Chris Boden
14/5/2024, 8:15:03 AM

I found the voice demos unbearable

Scott Hannan
14/5/2024, 8:44:30 AM

The glitches or the personality?

Chris Boden
14/5/2024, 8:51:09 AM

Personality

Chris Boden
14/5/2024, 8:51:50 AM

Patronising know it all

Chris Boden
14/5/2024, 8:52:58 AM

If you have a super smart ‘friend’ who is a patronising know it all, you tend to avoid them

Chris Boden
14/5/2024, 8:55:14 AM

Was there anything about agent capabilities or web use?

Luc Pettett
14/5/2024, 8:56:01 AM

I’m wondering with these naturally multi modal models how do you gate input from users from just asking for images?

Luc Pettett
14/5/2024, 8:56:21 AM

Custom moderation is becoming so much of what we’re doing

~ Andrew Bucknall
14/5/2024, 8:56:50 AM

The chart reading actually works!

Luc Pettett
14/5/2024, 8:57:01 AM

Ohh that’s handy

Scott Hannan
14/5/2024, 9:04:44 AM

You know she mentioned “Browse” in the GPT4o intro but her slides and voiceover didn’t match, so she barely explained anything that might be new here.

Chris Boden
14/5/2024, 9:11:45 AM

Feels like a big bet on consumer (mobile/voice/vision) as opposed to worker.

Chris Boden
14/5/2024, 9:54:48 AM
Luc Pettett
14/5/2024, 10:38:38 AM
Image
Luc Pettett
14/5/2024, 10:51:55 AM
Image
Luc Pettett
14/5/2024, 10:52:04 AM

almost as sensitive as sensitive claude

Chris Boden
14/5/2024, 2:06:42 PM
Image
Luc Pettett
14/5/2024, 2:09:28 PM
Image
Chris Boden
14/5/2024, 4:21:39 PM

the second half of the greg brockman demo was pretty wild
https://vimeo.com/945587185

Scott Hannan
15/5/2024, 2:36:52 AM

Yeah that is quite amazing, especially the speed of initial response, processsing and follow-up. The computer vision also…imagining all these use cases where specialists no longer need to do a site visit. Anyone can do the walk through for them, then they’ll receive this incredibly detailed narrative along with the visual.

Chris Boden
15/5/2024, 9:02:16 AM

The bit where it recalled that the person had walked behind him and done the bunny ears - historical visual recall - wonder how that is being done. are they snapshotting the cam every x seconds and transcribing what is seen into text and adding into chat history/context window? would love to know.

Chris Boden
15/5/2024, 9:31:20 AM

image input is priced at $0.002312 per image

Chris Boden
15/5/2024, 9:51:10 AM
Image
~ Andrew Bucknall
15/5/2024, 9:53:08 AM

The models are impressive but I felt so uncomfortable/awkward watching the videos. I guess it’s a jab at Google’s over produced fake videos? 🤷 That visual recall bit was particularly impressive!

Chris Boden
15/5/2024, 9:55:44 AM

Yes, we're in the uncanny valley of voice now. I'm interested to see if we can prompt the voice and delivery into something that doesn't have patronising/fake familiarity vibes

~ Andrew Bucknall
15/5/2024, 9:56:28 AM

Yea it reminded me of B-grade voice acting in an old video game

Luc Pettett
15/5/2024, 10:12:17 AM

it's winning over the retail punters that's for sure

Chris Boden
15/5/2024, 10:41:40 AM
Image
Scott Hannan
15/5/2024, 11:55:47 AM

Yeah, or all video / images recorded as embeddings, and if called upon (like he did) look for anomalies and then translate to human readable.

Chris Boden
15/5/2024, 5:28:57 PM
Image
Chris Boden
15/5/2024, 5:30:31 PM
Image
Chris Boden
15/5/2024, 5:37:20 PM
Image
Chris Boden
15/5/2024, 6:01:42 PM

quite a cool video explaining multimodal embedding using the sticker baker website

Ian Hansel
15/5/2024, 6:02:51 PM

You added Ian Hansel

Chris Boden
15/5/2024, 6:03:34 PM

Adding data science legend and rag expert Ian Hansel (from the Hub) to the group

Ian Hansel
15/5/2024, 6:05:27 PM

Thanks Chris 🙂

Chris Boden
15/5/2024, 7:41:32 PM
Chris Boden
15/5/2024, 7:55:43 PM
Image
Chris Boden
15/5/2024, 7:57:01 PM

"This is something I've worked on for a while! You can save the activations of one LLM call and reuse them for a follow-up that overlaps with the first.

This means asking a question about a big codebase can take 30 seconds the first time and 1s after that!"

Chris Boden
16/5/2024, 2:42:50 PM

Just wanted to let you all know that I've resigned from OpenAI

Luc Pettett
16/5/2024, 3:22:37 PM

noo

Luc Pettett
16/5/2024, 3:24:02 PM

we're finding gpt4-o is producing much more interesting results, more refined and less "AI like". How do you decide that llama3 is better @61421850424 regardless of inference costs etc?

Luc Pettett
16/5/2024, 3:24:08 PM

do you benchmark somehow?

Chris Boden
16/5/2024, 4:14:29 PM

Haven't had a play through api yet but finding chatgpt much better at coding since new model rollout. its lazy days are gone. just wants to write code. i don't have a rigorous way to benchmark - just run a bunch of different llms on same prompt and see which one best follows instructions and gives best outcomes. i have chatgpt desktop app now but not seeing the feature for having it "see" your desktop as they were doing in the demos. if someone works that out, please let me know. also don't think latest mobile app is out in aus app store yet.

Chris Boden
16/5/2024, 4:16:56 PM
Image
Ian Hansel
16/5/2024, 4:48:08 PM

This is pretty cool for evals, https://ukgovernmentbeis.github.io/inspect_ai/

Andrew Birt
16/5/2024, 6:41:06 PM

You added Andrew Birt

Chris Boden
16/5/2024, 6:41:29 PM

Added Andrew Birt to the group.

Luc Pettett
17/5/2024, 9:14:11 AM

Hey fellas, i'm about to invite a guy to the chat called Mark Tenenholtz. Check him out on Twitter:



I met him in the US on the AI delegation tour. He's an absolute champion, wonderful guy, incredibly bright and driven.

Chris Boden
17/5/2024, 11:19:41 AM

Cool, I've been following Mark on Twitter for a few years. Good addition

Chris Boden
17/5/2024, 11:20:54 AM

It's only been 48 hrs but seems to me none of the multimodal capabilities available yet through api or mobile/desktop apps. Anyone seen different?

Scott Hannan
17/5/2024, 11:21:15 AM

Same

Luc Pettett
17/5/2024, 11:21:39 AM

yeah he's got a solid community on twitter, 100k+ followers

Luc Pettett
17/5/2024, 11:22:06 AM

haven't tried but certainly using GPT4-o in production now for inference tasks, doing really well

Chris Boden
17/5/2024, 11:22:57 AM

Yes, the text gen seems to be a big improvement

Chris Boden
17/5/2024, 11:23:25 AM

I guess new sdk's and big changes needed for the multimodal apis

~ Andrew Bucknall
17/5/2024, 11:24:16 AM

The image recognition works on the ChatGPT web app at least, I haven’t tried mobile though

Andrew Birt
17/5/2024, 11:26:14 AM

I think voice is working through vapi.ai, can select 4o as the model and then prompt and chat to it.

Luc Pettett
17/5/2024, 11:27:43 AM

A matter of time before big orgs are monitoring our screens, summarising everything on local inference, streaming org level insights into a mega model to exec teams, or exec models making decisions on capital deployment

Luc Pettett
17/5/2024, 11:27:55 AM
Image
Luc Pettett
17/5/2024, 11:28:56 AM

Or Beffjezos .ai

~ Mark Tenenholtz
18/5/2024, 12:21:40 PM

~ Mark Tenenholtz joined using this group's invite link

Luc Pettett
18/5/2024, 12:22:23 PM

Welcome mark!

~ Mark Tenenholtz
18/5/2024, 12:22:40 PM

Hey folks! Thanks for the invite Luc

Chris Boden
18/5/2024, 4:54:12 PM
Image
Chris Boden
18/5/2024, 5:21:31 PM
Image
Chris Boden
18/5/2024, 5:21:57 PM
Image
Aaron Blackman
18/5/2024, 5:32:17 PM

Craft Ventures new Glue.ai (competing with Slack) using this interesting rag as a service platform. Ragie.ai

https://ragie.ai/

Chris Boden
19/5/2024, 7:43:39 PM

Thanks Aaron. Quite a few interesting takeways from that latest All In episode. Something David Sacks said about Ai business that resonated: "if you're an app developer, the key thing to understand is where does model innovation end and your innovation begin? because if you get that wrong, you'll end up doing a bunch of stuff that the model will obsolete in a few months"

Chris Boden
19/5/2024, 7:55:20 PM

Also, latest Dwarkesh Patel podcast dropped. He chats with John Schulman, openai cofounder.

A lot of fascinating stuff in this conversation too. I think gpt-4-o has really given a sense that agi is on an accelerated timeline. In this chat, Schulman is talking (1-3 years). Very interesting stuff about long horizon training and what that unlocks, also what post-training is, etc. But the thing that really is wild are the implications of AGI on that timeframe. There are no adults in charge.

Chris Boden
20/5/2024, 8:14:09 AM
Image
Chris Boden
20/5/2024, 9:53:14 AM
Ian Hansel
20/5/2024, 10:03:03 AM

That’s hopefully going to keep driving open source and local LLM development. Outputs are already untrustworthy enough from default gui outputs.

Scott Hannan
20/5/2024, 10:38:46 AM

Similar to what was leaked as part of OpenAI’s “preferred publisher program”. They didn’t say ads explicitly, but they did say “priority placement” and “richer brand expressions.” As you say Chris I don’t think it’s the right thing for users but the pressure to drive revenue is only going up.

Chris Boden
20/5/2024, 1:03:26 PM

Been playing with Gemini via AI Studio over the weekend. Findings:

Google have realised that there were waaaaay too many obstructions in the way of playing with gemini and seeing what it could do. Getting api keys etc previously was a hell of thing via cloud console, etc. AI Studio makes that all a lot easier. Google AI Studio is analogous I guess to OAI playground and Anthropic console.

The most interesting thing is of course the massive context window (2m tokens) and the fact that its free to play with right now. The AI studio lets you create and save prompts, including use of docs from drive. This ultra large context window is quite a different thing to RAG and yields impressive results. My use case was doing desk research where I uploaded a large 60page report from nasa about drone use in firefighting and a large document of ours and asked it to do comparative analysis, etc. This is not something you could do with RAG. In this case, it presumably parses out the entire content from the pdf/s and pastes it into the prompt history, which gets passed in with each question. Responses can take up to 60 secs but for certain kinds of work, this is worth the wait because the outputs are definitely legit.

Chris Boden
20/5/2024, 1:03:43 PM
Ian Hansel
20/5/2024, 1:15:28 PM

I think it’s surprisingly underrated. Especially with file uploads basically being just like google drive. I’ve done video summaries that are great, and coding isn’t too bad either. The responses definitely feel like they take longer but given the tasks I don’t feel like it’s too long (I don’t even have 2m context yet 🙂)

Ian Hansel
20/5/2024, 1:16:49 PM

Got Fomo and signed up for this today, anyone else doing it? https://maven.com/parlance-labs/fine-tuning

Luc Pettett
20/5/2024, 10:10:27 PM

I would appreciate this, hardens our position as a custom LLM

Luc Pettett
20/5/2024, 10:21:06 PM

looks fun! would love your notes!

Luc Pettett
20/5/2024, 10:28:25 PM

https://developer.chrome.com/docs/ai/built-in

You might like to join this @61421850424.

Man, on device LLM could give Google one leg up over OpenAI if they roll it quickly. If it's good and free, Googs could integrate it into their responses and combine it with their ad-model and undercut OpenAI for a decent free model (assuming the actual responses are good).

Luc Pettett
20/5/2024, 11:03:13 PM
Image
Chris Boden
20/5/2024, 11:05:24 PM

fascinating

Chris Boden
21/5/2024, 12:13:31 AM
Image
Chris Boden
21/5/2024, 7:53:02 AM

Interesting new pattern: the AI tour guide through a ux

Chris Boden
21/5/2024, 8:12:16 AM

Microsoft event overnight. Pc copilot getting realtime access to your screen and uses multimodal gpt-4-o model to do voice and vision by the look of it. This use case is getting coached through minecraft in realtime

Andrew Birt
21/5/2024, 8:14:48 AM

Fantastic, been waiting for this! RPM, RPA kind of tricky at the moment or expensive with UiPath and the like.

Luc Pettett
21/5/2024, 11:50:14 AM
Image
Luc Pettett
21/5/2024, 11:50:27 AM

awesome video from scott galloway

Luc Pettett
21/5/2024, 11:50:27 AM

Chris Boden
21/5/2024, 3:01:43 PM

I dare say the anti prof galloway index has been a pretty good bet over the years. Should be a synthetic etf option on telescope.

The contra bets:

Tiktok: Ccp refuses to sell and it gets shut down in US
Ozempic, etc: have peaked and will crash
AI: not close to peak, you ain't seen nothing yet
India is the new china: this is predicting the present/past not the future
Meta: fine, ok, whatsapp
Loneliness: AI materially contributes to solving loneliness

Chris Boden
21/5/2024, 7:08:06 PM
Image
Ian Hansel
21/5/2024, 7:38:30 PM
Image
Chris Boden
21/5/2024, 8:02:59 PM
Image
Chris Boden
21/5/2024, 8:17:19 PM
Image
Chris Boden
21/5/2024, 8:19:15 PM
Scott Hannan
22/5/2024, 11:56:13 AM

Another day, another 3 enterprise focused (re)launches… Inflection… we have a new team, 18 months of runway and the highest EQ model. Sean their new CEO is a school/running bud.

Lanai, Lexi Reese ex Goog and Senate candidate(!!) teaming up with VMware CTO. “Lanai is based on the premise that companies need to lean into lots of different Gen AI services to stay competitive.  But you can only gain durable advantage if you can see clearly how employees are interacting with Gen AI, protect against the bad stuff and help accelerate the positive use cases. “

And Stack funded by GV: “Thanks to Stack AI, you don't need to reinvent the wheel or direct your best engineers to infuse AI throughout your organization. Our platform allows you to deploy custom AI assistants and AI-powered workflows with the click of a button. Both technical and non-technical users can leverage our end-to-end drag-and-drop visual interface and APIs.”

Luc Pettett
22/5/2024, 3:55:52 PM
Luc Pettett
22/5/2024, 3:56:01 PM

fantastic newsletter if you guys aren't following already

Luc Pettett
22/5/2024, 3:56:33 PM

I'm a big fan of the unsubscribe button. this one makes it through every week

Chris Boden
22/5/2024, 7:24:27 PM
Image
Chris Boden
22/5/2024, 7:29:48 PM
Image
Chris Boden
22/5/2024, 7:34:13 PM
Image
Luc Pettett
22/5/2024, 7:34:19 PM

So is this just a dimensional embedding map on temperature?

Luc Pettett
22/5/2024, 7:34:42 PM

Time to reverse engineer the weights 💪

Chris Boden
22/5/2024, 7:35:39 PM

He talked about creating lie detector tests for AI's. I guess this is how you would do it - like ask them questions and see if certain neurons/features fire - like polygraph for humans

Chris Boden
22/5/2024, 7:36:30 PM

"Do you plan to take over the world?"
'No, why I would I want to do that?'
📈

Chris Boden
22/5/2024, 7:45:14 PM

I think this probably comes at a really good time when decels are trying to shut down AI progress. The research suggests that we can understand what's happening in the black box and control/steer these things away from harmful outcomes, not just just through system prompts and output filters

Chris Boden
22/5/2024, 8:07:07 PM
Image
Ian Hansel
22/5/2024, 8:11:08 PM

Yeah really like Linus’ takes on ui/ux and diving into models and embeddings

Chris Boden
22/5/2024, 8:39:22 PM
Image
Luc Pettett
22/5/2024, 8:39:58 PM

Rag cloud content?

Luc Pettett
22/5/2024, 8:40:06 PM

Or just upload

Chris Boden
22/5/2024, 8:43:18 PM

upload from google docs
then once uploaded, eg goog sheet, it is interactive in the ui

Chris Boden
22/5/2024, 8:44:56 PM
Image
Chris Boden
22/5/2024, 8:46:12 PM
Image
Chris Boden
22/5/2024, 8:47:21 PM
Image
Luc Pettett
22/5/2024, 8:49:43 PM

its interesting OpenAI is really running such a duel strategy with ChatGPT vs the API, focusing on consumer. I would have thought they'd just go all in on the platform but somehow they're juggling both

Chris Boden
22/5/2024, 8:51:46 PM

they're miles ahead on consumer
even tinkering with llm app makes me realise how much has gone into that product (beyond the llm itself)

Luc Pettett
22/5/2024, 8:53:01 PM

streaming video will be super helpful. it's going to be the new tutor for our kids at home

Luc Pettett
22/5/2024, 8:53:34 PM
Andrew Birt
22/5/2024, 9:34:15 PM

Incredible really. Could save a few school fees in years to come 😂

Chris Boden
23/5/2024, 8:11:02 AM
Image
~ Mark Tenenholtz
23/5/2024, 8:12:50 AM

I chatted with OpenAI’s GTM lead at their DevDay conference and he said the products they release right now are akin to chip-level components on a motherboard. Definitely has been the plan this whole time

Luc Pettett
23/5/2024, 8:14:25 AM

nice to know we can code in GPT's assembly language

Luc Pettett
23/5/2024, 8:15:09 AM

mov eax, 1 ;

Scott Hannan
23/5/2024, 9:21:06 AM

Impressive list and though they are under fire, it’s so hard for Snap, Pins, Rddt / open web to compete with this.

https://support.google.com/google-ads/answer/14835145?sjid=17426367770629278872-NC#ai-essentials

Luc Pettett
23/5/2024, 11:13:35 AM

anyone know an effective way to convert image based presentations into markdown?

Chris Boden
24/5/2024, 11:11:59 AM

my approach would be to screenshot and pass in image to a prompt template which has a target output schema that you want it to output to, ie image in, json out. what's the nature of the images?

~ Andrew Bucknall
24/5/2024, 12:52:44 PM

From MS Build event: if the compute power to train GPT-4 was an Orca, they've deployed a "whale" for training GPT-5

Luc Pettett
24/5/2024, 2:09:50 PM

need to turn these types of pages into markdown

Luc Pettett
24/5/2024, 2:09:52 PM
Image
Luc Pettett
24/5/2024, 2:09:58 PM
Image
Luc Pettett
24/5/2024, 2:10:02 PM

basically any annual report

Luc Pettett
24/5/2024, 2:10:13 PM

from anywhere in the world

Chris Boden
24/5/2024, 6:14:35 PM
Chris Boden
24/5/2024, 7:31:54 PM
Image
Chris Boden
24/5/2024, 8:25:21 PM
Image
Chris Boden
24/5/2024, 8:33:44 PM

Really interesting perspective from Jasion Liu here on rag as report generator vs rag as chatbot

Luc Pettett
24/5/2024, 8:59:56 PM

Can’t disagree!

Chris Boden
24/5/2024, 9:00:31 PM
Luc Pettett
24/5/2024, 9:02:08 PM

I like that. Can relate.

Andrew Birt
24/5/2024, 9:13:58 PM

Spot on, learning by playing.

Chris Boden
28/5/2024, 5:24:35 PM
Chris Boden
29/5/2024, 9:44:17 PM
Chris Boden
30/5/2024, 4:02:02 PM
Image
Scott Hannan
31/5/2024, 12:28:26 AM
Scott Hannan
31/5/2024, 3:55:47 AM

Made me a little more skeptical of Sam. Obviously don’t like each other but she comes across as very reasonable. And the response from Bret Taylor at the end is heavy handed IMO. https://open.spotify.com/episode/4r127XapFv7JZr0OPzRDaI?si=7QIXXlbWRdG3F6Vt_ecJRQ

Ian Hansel
31/5/2024, 7:02:59 AM

Nice idea and write up on linear adapter for queries in rag - https://research.trychroma.com/embedding-adapters

Chris Boden
31/5/2024, 9:38:18 AM

Jeez, a lot of slander in that 20mins.
Ms Toner must have good lawyers.

Scott Hannan
31/5/2024, 9:59:42 AM

I thought she went easy on him lol

Chris Boden
31/5/2024, 11:25:47 AM
Image
Luc Pettett
31/5/2024, 7:25:37 PM

I guess some startups have hardware too.

Chris Boden
1/6/2024, 9:24:29 AM

Is this a slop factory? Or something useful

Scott Hannan
1/6/2024, 9:38:29 AM

Strategy is to throw everything at the wall. I don’t think threads were being shared very much so to me this is the angle, a more persistent/shareable unit. Not clear if Google will be able to crawl the Pages. Also kind of reminds me of Pinterest - curating a board to inspire others… this behavior though is not mainstream.

Chris Boden
1/6/2024, 12:24:02 PM
Image
Andrew Birt
1/6/2024, 12:51:08 PM

RIP Good Times Sequoia 08 vibes

Chris Boden
1/6/2024, 5:01:32 PM
Image
Chris Boden
2/6/2024, 8:37:23 AM

This is pretty cool. Experiment from google which takes a paper and turns it into an audio/podcast type discussion between two people - one sort of asking the native questions and the other the science explainer

https://illuminate.withgoogle.com/

Chris Boden
2/6/2024, 8:38:02 AM

*naiive

Chris Boden
2/6/2024, 8:42:25 PM

Matt Webb (@genmon) is one of my favourite creatives - he’s got a great post here about embeddings which gives some thought-provoking hints of the untapped opportunities for new products and services.

https://interconnected.org/home/2024/05/31/camera

Scott Hannan
3/6/2024, 5:08:36 AM

Love this. It’s pretty close to everything I’ve been working on with ads/creative. The line “put essays side by side…see where their topics intersect l” is exactly how we’re training models on image and video embeddings, looking at the relationships related to performance (eg CTR) and adjusting GenAI model weights to complete the flywheel. There are many issues and corner cases to be figured out to ensure it doesn’t become a homogeneous hellscape.

Luc Pettett
3/6/2024, 11:20:58 AM

Make sense why OpenAI is cutting all these data deals atm. I bet they’re locking it down to other LLMs

Luc Pettett
3/6/2024, 11:23:27 AM
Image
Ian Hansel
3/6/2024, 11:43:48 AM
Image
Chris Boden
4/6/2024, 7:08:56 PM
Image
Ian Hansel
5/6/2024, 3:37:59 PM

This looks pretty cool, kind of like perplexity pages, shame it’s unavailable - https://notebooklm.google/

Chris Boden
5/6/2024, 9:56:15 PM

Another cool Matt Webb demo: smart home agent. @61434680637 you'll like this
https://interconnected.org/more/2024/lares/

Andrew Birt
6/6/2024, 8:34:13 AM

Ha, that’s fantastic. Have been doing a little bit of advisory work for U-Tec will forward that on. 🙏🏼

Chris Boden
6/6/2024, 5:50:33 PM

Thread by @joshm on Thread Reader App – Thread Reader App.pdf • 3 pages

Luc Pettett
6/6/2024, 9:38:50 PM

hehe simple but spot on.

Luc Pettett
6/6/2024, 9:39:38 PM
Image
Chris Boden
7/6/2024, 6:45:59 AM
Luc Pettett
7/6/2024, 7:57:32 AM

Predictions on AGI and economic / military advantage written by Leopold Aschenbrenner ex OpenAI. speculative / anecdotal but really interesting.

https://situational-awareness.ai/wp-content/uploads/2024/06/situationalawareness.pdf

Ian Hansel
7/6/2024, 9:27:49 AM

This is available now, pretty good, worth checking out, like a nice general purpose RAG system - https://notebooklm.google/

Luc Pettett
7/6/2024, 11:09:08 AM

Alogorithms for Decision Making.pdf • 700 pages

Luc Pettett
7/6/2024, 11:09:14 AM

some nice reading for the weekend fellas

Luc Pettett
7/6/2024, 11:09:38 AM

a friend just came back from stanford, this was part of his volumes of notes he's shared

Chris Boden
7/6/2024, 12:10:22 PM

pip_springboards.pdf • 68 pages

Andrew Birt
7/6/2024, 2:56:33 PM

Nice. Could be a few weekends in that one 😅

Chris Boden
7/6/2024, 9:17:00 PM

Latest YC podcast
"Better AI Models, Better Startups"

Chris Boden
7/6/2024, 10:48:24 PM
Image
Scott Hannan
8/6/2024, 12:55:20 AM

Halfway through this epic podcast with Leo…guy has a big brain! https://open.spotify.com/episode/5NQFPblNw8ewxKolIDpiYN?

Andrew Birt
9/6/2024, 7:45:12 AM

Anyone experimented with running models on something like vs AWS? Tempted to give it a shot. https://www.hetzner.com/dedicated-rootserver/matrix-gpu/

Andrew Birt
9/6/2024, 7:45:58 AM

This message was deleted.

Andrew Birt
9/6/2024, 7:51:38 AM
Chris Boden
9/6/2024, 9:26:33 AM

How Claude gets its personality. A conversation with the researchers that handle the fine tuning

Chris Boden
9/6/2024, 10:09:44 AM
Image
Chris Boden
9/6/2024, 8:18:06 PM

websim is such a fun thing to play with. Quite remarkable really. Not sure if I am doing it right but my tactic is to put sort of made up urls into the embedded browser window and manipulate what it creates using the the url. I 'made' one tonight for viewing different algae species.

If you haven't played with it, my explanation would be that its actually a new kind of llm modality. Instead of a chatbot with prompt in and llm output response, the output is a web page coded and rendered on the fly (using claude opus). In my case, the 'prompt' is a url like this "https://algaeviewer.math/species/spirulina" and claude faithfully builds a web page with actual algae visualisations, incl particle simulation. you could do something like add a url param to the url eg ?zoom=300x&speed=2x and it will then create a new page with those params giving it a hint as to what to build next. So cool and inspiring.

https://websim.ai/c/hpTVToceEcJcNdItS

Scott Hannan
11/6/2024, 4:18:43 AM

Nice move. I run with Kevin occasionally, he’s an ex physicist and super thoughtful product guy. ChatGPT should see a marked improvement, especially given it was an accidental consumer success https://www.theinformation.com/articles/openai-hires-sarah-friar-as-cfo-kevin-weil-as-chief-product-officer

Chris Boden
11/6/2024, 7:48:58 PM
Chris Boden
11/6/2024, 8:19:55 PM
Image
Chris Boden
11/6/2024, 8:21:05 PM
Image
Chris Boden
11/6/2024, 8:23:00 PM
Image
Chris Boden
11/6/2024, 9:09:00 PM

Apple announcements seem pretty meh. The minimum one would expect. A long way from autonomous agent Siri

Scott Hannan
12/6/2024, 3:46:57 AM

Yeah agree. Part of the challenge is the existing expectation that iOS stuff does inter-operate and is magical, and people don’t care if it’s AI driven or not. I do think App Intents is the killer ecosystem piece…it removes friction and since they have the apps, auth, payments etc it can in theory enable all the Rabbit demo stuff (for real). To your point though Chris the orchestration across apps is missing. And, you have to get people to build. The Waking Up widget (for example) didn’t drive more meditation mins :)

Scott Hannan
12/6/2024, 9:53:24 AM

Gmail - WWDC, Apple Intelligence, Apple Aggregates AI (Stratechery Update 6-11-2024).pdf • 9 pages

Scott Hannan
12/6/2024, 9:53:44 AM

Nice summary from Ben.

Chris Boden
12/6/2024, 9:54:30 AM
Image
Luc Pettett
12/6/2024, 9:55:13 AM

Long live Apple automator!

Luc Pettett
12/6/2024, 9:56:05 AM

That is a good thought Chris. The fact that you can screen mirror on a Mac too paves a path for a third party app to try it too

Scott Hannan
12/6/2024, 9:57:43 AM

And we do have to talk about naming. OK fine it was genius :) But, also had me thinking that if Apple was around in 1908 and built a combustion engine car they would have announced it runs on *incredible* Appoline.

Chris Boden
12/6/2024, 10:39:01 AM

Thread by @mihail_eric on Thread Reader App – Thread Reader App.pdf • 3 pages

Chris Boden
14/6/2024, 6:42:55 PM

Just took the ultimate leap of faith: self serve checkin and automated bag drop at BNE for multi leg international flight.

Literally no staffed checkin counters anymore - a stealth redundancy of a whole job category coming out of covid.

It was also a reminder that although the talk today is about AI taking the jobs, the biggest job eliminator of recent times has been the rather less intelligent technology of bar codes and QR codes.

Pip Bingemann
15/6/2024, 2:50:09 PM

Unfortunately people only care when it’s the white collar jobs. “Unskilled” jobs have been getting automated for decades.

Aaron Blackman
15/6/2024, 4:58:24 PM

Happy travels Chris. Hope the barcodes didn’t send the bags to Zimbabwe.

Chris Boden
15/6/2024, 5:06:56 PM

Miraculously they made their way, autonomously, through the labyrinthine luggage supply chain onto the designated baggage carousel. A miracle of modern life

Andrew Birt
17/6/2024, 7:07:03 AM

Safe travels! Did the golf clubs make their way through too?

Scott Hannan
19/6/2024, 7:16:51 AM

An Interview with Daniel Gross and Nat Friedman About Apple and AI – Stratechery by Ben Thompson.pdf • 22 pages

Luc Pettett
19/6/2024, 8:23:23 AM

Thanks @14154125965 that was fantastic

Luc Pettett
19/6/2024, 8:24:57 AM

I was just reading Benedict's Newsletter this morning, also very interesting views that align somewhat:

Luc Pettett
19/6/2024, 8:25:01 AM
Image
Scott Hannan
3/7/2024, 10:09:17 AM

An Interview with Scale AI CEO Alex Wang About the Data Pillar for AI – Stratechery by Ben Thompson.pdf • 16 pages

Scott Hannan
4/7/2024, 5:43:58 AM

The naunce between a visual generation tool (e.g. MJ) and a design tool is interesting …almost like the former is OK cause it’s just inspiration but for Figma, it’s a design tool and feels more like copying.

Chris Boden
4/7/2024, 5:09:23 PM

Get to it team! Sequoia reckons we need $600b in rev to justify the Ai infra investments of recent years

https://www.sequoiacap.com/article/ais-600b-question/

Chris Boden
4/7/2024, 9:39:31 PM

A leak of the Claude 3.5 system prompt shows best practice for chain of thought prompting. Check it out

Scott Hannan
7/7/2024, 4:41:33 AM
Luc Pettett
10/7/2024, 10:08:18 AM

Whitepaper drop: https://arxiv.org/pdf/2406.17711 10x improvement on model training for multi-modal (image-text pairs) from the Google Deep mind team.

Luc Pettett
10/7/2024, 10:08:46 AM

just wanted to say thanks for this Chris, we altered some of our prompts, worked super well.

Luc Pettett
10/7/2024, 10:10:03 AM

luc is thinking about how many of chris' tips have improved his product

Chris Boden
10/7/2024, 10:50:38 PM
Ian Hansel
11/7/2024, 11:05:48 AM
Image
~ Andrew Bucknall
11/7/2024, 11:09:19 AM

I've recently made the jump to Claude.ai over ChatGPT... the artifacts feature is wonderful, and Claude 3.5 Sonnet seems to do a better job at reading/writing code. The system prompt generator in the console is great for getting started too, it would be nice to be able to "work on it" using a chat+artifact like UI though

Luc Pettett
11/7/2024, 4:40:10 PM
Image
Chris Boden
11/7/2024, 5:03:11 PM

Doing an Upstream session next Tuesday on using Google AI Studio for proposals/tenders/applications etc.

Each of these tools has their particular strength and Gemini 2’s is 2m token context. Allows you to stuff a prompt with so much valuable context and produce really useful outputs.

So cool that there is massive competition and really at $20 a month you’d be placing a very low value on your own hourly rate equivalent if you didn’t splash out on them.

Come along 10:30 of you like.

Aaron Blackman
11/7/2024, 5:50:11 PM

Who’s running that session Chris?

Luc Pettett
11/7/2024, 6:33:10 PM

I reckon it’s nothing but yours truely the Peregian janitor himself

Chris Boden
15/7/2024, 11:48:22 AM
Luc Pettett
15/7/2024, 2:28:01 PM

handy!

Chris Boden
15/7/2024, 9:10:30 PM
Luc Pettett
15/7/2024, 10:11:12 PM

wow.

Andrew Birt
16/7/2024, 6:32:38 PM

@61421850424 Great talk today. 2M context window still blowing my mind 🤯

Chris Boden
16/7/2024, 8:44:09 PM

If you are interested, here are the slides from my talk on using Gemini 1.5 pro (via Google's AI Studio) to develop proposals in response to RFP's/Tenders/EOI/ITQ/Grant applications, etc. It includes a link to the 'Proposal Writer' prompt chain template in AI studio that was developed for the workshop.

tinyurl.com/upstream0724

Chris Boden
17/7/2024, 10:32:00 AM
Image
Chris Boden
17/7/2024, 7:22:57 PM
Scott Hannan
18/7/2024, 1:11:05 AM

This is great. My son has been using Khanmigo (Khan’s AI teacher) since it launched. It’s often like having the teacher right there. Doesnt give direct answers, super encouraging, forces you to think harder, etc.

Luc Pettett
18/7/2024, 11:35:57 AM
Image
Luc Pettett
18/7/2024, 11:35:57 AM
Image
Luc Pettett
18/7/2024, 11:35:57 AM
Image
Luc Pettett
18/7/2024, 11:36:01 AM

Sorry not very AI related..!

Andrew Birt
18/7/2024, 11:36:50 AM

Dave Hughes did a joke along these lines 😂

Luc Pettett
18/7/2024, 11:36:59 AM

haha gold

Scott Hannan
18/7/2024, 11:37:20 AM

Luc wait until you see what they call plumbers that unblock drains :)

Luc Pettett
18/7/2024, 11:39:37 AM

@61423698470 you might want to load urban dictionary into context and spin up a springboard called "cringe test"

Pip Bingemann
18/7/2024, 11:41:27 AM

Haha funnily enough we have something like that on our roadmap!

Chris Boden
18/7/2024, 9:46:50 PM

Something worth noting about Gemini pro's 2m token context is that they have implemented token caching in the api.

here's my understanding of how it works and why its a big deal: let's say you have a huge base prompt chain of say 1 million tokens (eg the text from 8 biographies about steve jobs), and your app was a steve jobs chatbot that could provide product/startup advice by answering questions as steve (using all of that biographical content as context).

Each time a user asked a question (of say 100 tokens), it would append that to the base prompt chain and all 1,000,100 tokens would be submitted to the api to generate the completion (ie steve's advice). in the normal case if the user asks 5 questions, that would be 5 x 1,000,100 = 5,000,500 input tokens. Gemini pro's pricing is $2.5/M input tkns and $7.5/M output tkns. And so, answering those 5 questions would (normally) have cost $12.50. You'd go bankrupt pretty quickly.

BUT context caching changes the economics significantly. Because you can cache the base prompt chain of 1m tokens on the first call, you only then pay for the incremental input/output tokens. And so, the cost for the subsequent calls is negligible - you could generate at least a hundred typical completions before you incurred another dollar of cost.

We need to think creatively about the new kinds of use cases this unlocks - its a very different mindset to RAG and lends itself to a raft of novel ideas not really possible up until now.

https://ai.google.dev/gemini-api/docs/caching?lang=python

Chris Boden
18/7/2024, 9:57:47 PM

One idea for a mental model of how to think about a 2m context window is that it is almost like fine-tuning gpt4-o to a particular subject. Bang a bunch of detailed content about a particular subject into that giant prompt and constrain the model to answer only from the info in the prompt.

Eg create a 2m token prompt chain containing everything relevant about a given subject, eg Nike, and that prompt becomes like a finetuned Nike Llm, knowledgeable on everything about Nike

Andrew Birt
18/7/2024, 10:05:53 PM

Very cool. Do you think this almost supercedes need for RAG entirely?

Luc Pettett
18/7/2024, 10:23:07 PM

just checked our DB of about 15 stocks on our beta (yet to scale up to 8000+ stocks) and we’re already at 3m tokens of context.. so I think RAG will be pretty important for a while for us! Also RAG brings you highly relevant context which in theory should just improve the response quality (removing noise)

Luc Pettett
18/7/2024, 10:23:50 PM

one of our team came across this today, some awesome prompts in there:

https://github.com/Doriandarko/claude-engineer

Luc Pettett
19/7/2024, 7:44:31 AM

Funny that OpenAI doesn’t promote any benchmark 4-o mini against Claude 3.5.

Luc Pettett
19/7/2024, 7:44:36 AM
Image
Chris Boden
19/7/2024, 12:14:57 PM
Image
Chris Boden
19/7/2024, 12:19:01 PM
Image
Aulay Macaulay
19/7/2024, 12:39:40 PM

You added Aulay Macaulay

Chris Boden
19/7/2024, 12:40:10 PM

Adding fellow Ai nerd Aulay to the group.

Chris Boden
19/7/2024, 6:44:29 PM
Image
Luc Pettett
19/7/2024, 8:43:34 PM

Oh that’s wild

Luc Pettett
19/7/2024, 8:46:36 PM

@61488427815 was testing it for us today. Massive reduction in our inference costs compared to Claude 3.5. We benchmarked it across some other tasks too, held up well.

Switching next week 👌

Chris Boden
19/7/2024, 9:45:10 PM

“Intelligence too cheap to measure”

Scott Hannan
20/7/2024, 2:50:17 AM

I wonder how this release impacts Etched who just raised $120m for asics/inference… their commercial pitch is “an order of magnitude cheaper than ChatGPT for inference” - 2 weeks later OpenAI say “our latest is an order of magnitude cheaper…(using GPUs of course)”. Not apples to apples but I bet they are glad they closed their round before the mini news?

~ Andrew Bucknall
20/7/2024, 2:07:11 PM

Yep very impressed with speed and capabilities, but the price makes it outstanding. Unfortunately it doesn’t have the same chart reading abilities as 4o or Claude 3.5, but seems excellent for light/medium reasoning and ideal for summarisation or rewriting large chunks of text.

Chris Boden
22/7/2024, 7:20:31 PM
Image
Chris Boden
22/7/2024, 7:22:06 PM
Image
Luc Pettett
22/7/2024, 9:38:03 PM

haha go zuck.

Luc Pettett
22/7/2024, 9:38:08 PM

badass

Chris Boden
23/7/2024, 7:40:25 AM

Doing a replay of the long context workshop at 11am if anyone wants to come along

Luc Pettett
23/7/2024, 7:42:41 AM

Ah would be keen but we’re almost live with 2 million users!

Luc Pettett
23/7/2024, 7:43:21 AM

Going to be a few tokens flying around 😳

Luc Pettett
23/7/2024, 9:17:21 AM
Image
Chris Boden
23/7/2024, 10:47:36 AM

Woohoo.

Will record

Ian Hansel
23/7/2024, 1:07:33 PM

That’s awesome, congrats @61466341580 !

Chris Boden
23/7/2024, 1:25:28 PM
Image
Chris Boden
23/7/2024, 1:28:55 PM

I guess we can should add another axes on which LLM's are competing: openness

- model intelligence
- context window size
- speed: tokens per second
- cost per million tokens in/out
- modalities (video, audio, image, etc)
- tool/artefact support
- openness (availability of weights)

Andrew Birt
24/7/2024, 8:59:09 AM
Andrew Birt
24/7/2024, 9:00:08 AM

The chain still throws me but great way to launch

Luc Pettett
24/7/2024, 10:49:11 AM

looks like Zuckerberg has been skiing with his Quest on!

Scott Hannan
24/7/2024, 10:50:43 AM

Yeah I think Zuck is finally hitting back at the Icelandverse guy…

Luc Pettett
24/7/2024, 10:52:49 AM

haha. man everyone loves Zuck now, those comments on X are so positive. That's what open source does to you

Ian Hansel
24/7/2024, 11:02:45 AM
Image
Ian Hansel
24/7/2024, 11:02:51 AM
Image
Andrew Birt
24/7/2024, 11:45:40 AM

Mid life crisis but done well

Andrew Birt
24/7/2024, 11:48:42 AM

When Bedrock say things are in preview, does this mean the full model is available to play with now, or it’s a cut down version? Sorry for noob question. https://aws.amazon.com/bedrock/llama/

Chris Boden
24/7/2024, 11:49:11 AM
Image
Luc Pettett
24/7/2024, 2:28:20 PM

doesn't show up for me, my guess is you have to be on their love list.

Scott Hannan
25/7/2024, 5:24:51 AM
Image
Scott Hannan
25/7/2024, 6:10:51 AM

Couple of big quotes from Ben Thompson’s last two posts:

"This decision [open sourcing] was a master-stroke: all of the evidence I have seen suggests that Llama is the standard for all AI applications outside of direct engagements with OpenAI and Microsoft; there is a good chance that Meta controls the future of AI for every entity that is not a tech company."

"...the E.U.’s AI Act states that any training done with more than 1025 compute is a systemic risk, and thus must be pre-emptively reviewed by the European Commission. In other words, yes, the E.U. is leading the world, specifically, in terms of not getting access to innovation. Congratulations?"

Chris Boden
25/7/2024, 6:12:49 AM

would an analogy be that
llama is 'linux'
and meta is redhat?

Andrew Birt
25/7/2024, 8:16:14 AM
Image
Andrew Birt
25/7/2024, 8:17:18 AM

Didn’t realise token pricing across providers was such a big variable

Scott Hannan
25/7/2024, 8:18:43 AM

Lol at IBM. What was that campaign, “no one ever got fired for buying IBM?”

Andrew Birt
25/7/2024, 8:26:58 AM

How times have changed 😂

Chris Boden
25/7/2024, 8:54:14 AM
Image
Chris Boden
25/7/2024, 8:56:19 AM

openai certainly took the sting out of the llama 405b model launch a bit by offering gpt-4-o-mini at that price - it is still miles below the cheapest llama 405b offering on any provider

Chris Boden
25/7/2024, 8:57:15 AM
Image
Luc Pettett
25/7/2024, 1:33:40 PM

that chart is interesting, I wonder how it goes scaling up though as surely some providers will subsidise entry pricing versus scale up pricing.

We use Replicate for a few things. Nice and clean API.

Luc Pettett
25/7/2024, 1:34:33 PM

IBM will promote that their input output token spread is the best in market

Chris Boden
26/7/2024, 9:20:11 AM
Image
Chris Boden
26/7/2024, 9:20:35 AM

Any thoughts on this Scotty?

Chris Boden
26/7/2024, 9:21:56 AM
Scott Hannan
26/7/2024, 9:29:24 AM

Keen to play with it mate (also on the waitlist). It’s hard to see Perplexity having any oxygen left here, other than being able to use competing models for results. OpenAI already won the consumer awareness war, and the revenue war is all about enterprise adoption. Perplexity is a nicer looking and more constrained product for enterprises (on the surface), BUT OpenAI will squeeze them out with sales and ability to bundle and discount. I think the advertising opportunity gets really interesting (let’s say people end up using gpt search 10x chatgpt… they have to monetize.) Again, much better positioned to stand up a basic ads approach and scale it (Perplexity will also struggle if they try this). Possible also OpenAI just use investor $ to ignore advertising and hurt Google in the process. PS - they are only scratching the surface with their data deals so far but they have $ and a clear proposition to publishers (for referenced sources) whereas the same folks are sending cease and desists to Perplexity.

Luc Pettett
26/7/2024, 11:40:29 AM

with open ai's deal with Apple this is going to get very interesting. Google pays Apple billions to be their default search provider.. wonder how that deal is looking now if OpenAI power Siri's search results. Perhaps because it's not a "browser" it's carved out already??

Luc Pettett
26/7/2024, 11:40:42 AM

either way, it's getting ugly for Googs

Chris Boden
26/7/2024, 3:26:26 PM
Image
Chris Boden
26/7/2024, 11:18:54 PM
Image
Scott Hannan
29/7/2024, 3:27:18 AM

I like this approach from Sierra: “Practically, this means that there is only a 25% chance that the agent will resolve 8 cases of the same issue with different customers – a number that is far behind the expectation of a real-world user-facing agent.”

https://sierra.ai/blog/benchmarking-ai-agents

Chris Boden
29/7/2024, 1:56:24 PM
Image
AI Nerding
29/7/2024, 5:20:39 PM

Aaron Blackman added ‪[number redacted]‬ and ‪[number redacted]‬

Ian Hansel
30/7/2024, 7:06:19 AM

Lots of good resources here - https://hamel.dev/blog/posts/course/

Ian Hansel
30/7/2024, 7:07:26 AM

And this article - https://applied-llms.org

Chris Boden
30/7/2024, 11:15:43 AM

These resources are fantastic. Doing the Prompt engineering one now

Luc Pettett
30/7/2024, 1:08:28 PM

Curious, just reading that Applied LLMs article and I'm wondering if you all have an opinion on putting RAG results in the system prompt or the user prompt? Which is a better design?

We have a bit of a mega prompt happening, together with a user query, RAG results and other contextual data going into our inference. considering moving elements out of the system prompt into the user prompt hoping better responses. thoughts?

Luc Pettett
30/7/2024, 1:11:57 PM

we're shifting to multi-step no matter but can't quite get a grip on our best approach. need a zero-shot architecture, go figure!

Chris Boden
30/7/2024, 1:22:46 PM

I tend to put the rag results in the assistant prompt, as follows...


system prompt: role, examples, format, etc
user: query
assistant: rag context
assistant: ...

---------------------------

With 128k context now, I think you could also simulate multi-step dialog

system prompt: role, examples, format, etc
user: query (example of a real user query)
assistant: rag context (example of real rag search result context)
assistant: final answer (example of perfect answer from context)
user: query (the actual query)
assistant: rag context (the actual rag context)
assistant:...

Luc Pettett
30/7/2024, 1:26:28 PM

user before rag, interesting

Chris Boden
30/7/2024, 1:27:55 PM

depending on the model, i sometimes use a second system prompt, eg

https://gist.github.com/chrisboden/837f86d3c729e5719a2cfce6336dc30f

Chris Boden
30/7/2024, 1:55:26 PM
Image
Scott Hannan
30/7/2024, 2:01:08 PM

I was just writing a report on them, conclusion being it is hard to see them making it as a standalone company. Great fit for Canva though.

Chris Boden
30/7/2024, 2:32:26 PM

The underlying tech is impressive. The ux is prosumer

Ian Hansel
30/7/2024, 4:14:21 PM

That’s interesting, I usually use the user prompt, or Cohere which has a ‘documents’ param in the api (which I’m guessing passes the documents through to the main body of the prompt + some extra work around generating good citations). I’d be keen to see what the difference is in System vs User prompt.

Chris Boden
30/7/2024, 5:46:06 PM
Image
Ian Hansel
30/7/2024, 6:17:03 PM

That’s awesome! Thanks for lining that up Chris

Scott Hannan
31/7/2024, 4:28:33 AM

For sure, though it’s a costly product to build and support, and Canva (at least up till now) have been able to bundle for no additional cost. This is a good summary of billing models too: https://www.lennysnewsletter.com/p/how-should-you-monetize-your-ai-features

Scott Hannan
31/7/2024, 10:01:13 AM

Meta on fire right now. SAM for video: “An image can be considered a very short video with a single frame. We adopt this perspective to develop a unified model that supports both image and video input seamlessly. The only difference in handling video is that the model needs to rely on memory to recall previously processed information for that video in order to accurately segment an object at the current timestep.” https://ai.meta.com/blog/segment-anything-2/

Chris Boden
31/7/2024, 8:42:22 PM
Chris Boden
31/7/2024, 9:01:43 PM

Is this the new post-transformer architecture?

Luc Pettett
31/7/2024, 10:40:44 PM

oh man Im certain persistent memory is the next unlock. hopefully this is it

Chris Boden
31/7/2024, 11:25:37 PM
Image
Chris Boden
31/7/2024, 11:27:23 PM
Image
Andrew Birt
1/8/2024, 10:42:15 AM

Will be fascinating to see how coherent these long outputs are, could almost write a novel with 64k 😅

Andrew Birt
1/8/2024, 10:43:39 AM

Hallucination Index Report (1).pdf • 27 pages

Luc Pettett
1/8/2024, 12:50:27 PM

Wonder if long output is just a sliding window variation of gpt4-o

Chris Boden
1/8/2024, 8:47:11 PM
Image
Chris Boden
1/8/2024, 8:54:55 PM

Sitting here typing a message on WhatsApp and at top of the screen is this

Chris Boden
1/8/2024, 8:55:10 PM
Image
Chris Boden
1/8/2024, 8:56:04 PM

And it reminds me of A16Z GP Alex Rampell’s post a few years back

“The battle between every startup and incumbent comes down to whether the startup gets distribution before the incumbent gets innovation.”

Chris Boden
1/8/2024, 8:57:41 PM

Kind of mind-blowing that Meta have slipped a gpt-4 level LLM model into prime ux real estate in WhatsApp. A mere 3bn users now with that at their disposal.

Chris Boden
1/8/2024, 8:59:18 PM

Did openai get enough distribution before meta got innovation?

Luc Pettett
1/8/2024, 9:03:40 PM

It’s been in instagram for a few months. Noticed it in WhatsApp last week. I find it annoying. Not the best place personally for an AI but you can’t blame meta for trying

I’m sure it’ll eat into Google search impressions.

Chris Boden
3/8/2024, 7:49:42 AM

Some interesting observations from Patrick O’Shaunessy here

Chris Boden
3/8/2024, 7:02:44 PM
Chris Boden
3/8/2024, 9:43:22 PM
Image
Chris Boden
3/8/2024, 10:24:13 PM
Image
Chris Boden
4/8/2024, 7:39:42 PM

You’re probably all across this but if not thought I would mention livekit.io - the webrtc framework being used by OpenAI for the ChatGPT mobile app - voice and video. Open source, self hosted or cloud option

Chris Boden
5/8/2024, 3:14:45 PM
Chris Boden
5/8/2024, 8:30:11 PM
Image
Luc Pettett
7/8/2024, 7:31:46 AM
Image
Luc Pettett
7/8/2024, 7:32:00 AM

can’t handle the lack of LLM..

Chris Boden
7/8/2024, 7:43:07 AM
Image
Chris Boden
7/8/2024, 7:43:36 AM
Luc Pettett
7/8/2024, 7:47:42 AM

very useful. i can’t see in the docs if it supports streamed objects.. e.g. this array response arrives item by item..

im guessing not because of the way the response stream works (text streaming). Hopefully that’s next?

Luc Pettett
7/8/2024, 7:47:50 AM
Image
Ian Hansel
7/8/2024, 8:27:42 AM

That’s super useful, I’d been using the cohere version and it’s made things so much easier. Plus they’ve put in a decent price drop.

Chris Boden
7/8/2024, 12:03:42 PM

Recording of the gumloop workshop, fyi

Chris Boden
7/8/2024, 7:15:56 PM
Ian Hansel
7/8/2024, 8:06:06 PM

This is a really cool idea (like the anthropic ‘I’m the golden gate bridge’) - don’t think it’ll take too long before this becomes a way to steer/modify outputs - https://www.neuronpedia.org/steer/gemma-2-2b

Luc Pettett
8/8/2024, 8:46:22 AM

super helpful! I was supposed to make it but ended up in Melbourne..

Wondering, do you have control of temperature? I couldn’t see that anywhere

Chris Boden
8/8/2024, 10:36:33 AM
Image
Chris Boden
8/8/2024, 10:37:13 AM
Chris Boden
8/8/2024, 11:51:44 AM

Very cool: Andrew Ng has just announced this new course

https://www.deeplearning.ai/short-courses/ai-python-for-beginners/

Andrew Birt
8/8/2024, 1:31:31 PM

Listened to this series on way to and from Melbourne this week. Great insight into Altman’s character. https://podcasts.apple.com/au/podcast/foundering/id1161880916?i=1000657976524

Chris Boden
8/8/2024, 11:03:08 PM
Luc Pettett
9/8/2024, 7:46:43 AM

that’s intense. yep, netflix is going to clean up on AI content..

Chris Boden
10/8/2024, 9:46:20 AM

Some nice improvements to Gemini via AI studio, including big price drop

https://developers.googleblog.com/en/gemini-15-flash-updates-google-ai-studio-gemini-api/

Andrew Birt
10/8/2024, 12:51:05 PM

Am sure most all over different observability tools, but this has become go to for us of late. Open source, easy to setup https://www.helicone.ai

Scott Hannan
10/8/2024, 2:53:31 PM

Interesting, though I have heard a number of ad agencies say they don’t have trust in synthetic audience signals, so that will be a hurdle to overcome.

Chris Boden
12/8/2024, 3:47:45 PM
Chris Boden
12/8/2024, 6:26:42 PM
Image
Chris Boden
14/8/2024, 9:43:48 AM
Andrew Birt
14/8/2024, 7:21:26 PM
Chris Boden
14/8/2024, 7:24:00 PM

That could be huge. Former deep minders

Andrew Birt
14/8/2024, 7:27:21 PM

Fascinating hey, i love this warning 😂

Andrew Birt
14/8/2024, 7:27:26 PM
Image
Chris Boden
14/8/2024, 7:28:18 PM

“Intelligence explosion”

Luc Pettett
14/8/2024, 7:38:20 PM

Keep your finger ready on the power off button!

Ian Hansel
14/8/2024, 7:43:49 PM

Looks like there’s some great stuff in the github repo as well https://github.com/SakanaAI/AI-Scientist/blob/main/ai_scientist/generate_ideas.py

Chris Boden
14/8/2024, 8:53:12 PM

“The AI Scientist is designed to be compute efficient. Each idea is implemented and developed into a full paper at a cost of approximately $15 per paper.”

Lawd, there I was thinking scientist’s jobs were safe

Chris Boden
14/8/2024, 8:54:14 PM

And this is with generation 4 models. Hoo boy

Chris Boden
14/8/2024, 8:59:15 PM

💀

“The AI Scientist Bloopers

We have noticed that The AI Scientist occasionally tries to increase its chance of success, such as modifying and launching its own execution script! We discuss the AI safety implications in our paper.

For example, in one run, it edited the code to perform a system call to run itself. This led to the script endlessly calling itself. In another case, its experiments took too long to complete, hitting our timeout limit. Instead of making its code run faster, it simply tried to modify its own code to extend the timeout period.”

Chris Boden
14/8/2024, 9:23:41 PM
Image
Luc Pettett
14/8/2024, 9:32:05 PM

Those prompts are so helpful!

Luc Pettett
14/8/2024, 9:50:36 PM

just adapted that prompt to build a futurist agent, looping over itself theorising about the future.. apparently the outflow of humans to mars will make mars HQ

Luc Pettett
14/8/2024, 9:50:40 PM
Image
Luc Pettett
14/8/2024, 10:08:24 PM
Image
Andrew Birt
14/8/2024, 10:35:27 PM

That seems about right 😅

Ian Hansel
15/8/2024, 11:12:26 AM

This is the plot to the expanse isn’t it 🙂

Chris Boden
15/8/2024, 5:52:52 PM

Love this

Chris Boden
15/8/2024, 7:09:04 PM
Image
Luc Pettett
15/8/2024, 7:09:20 PM

wonderful!

Chris Boden
15/8/2024, 7:09:57 PM

thought you might like that

Chris Boden
16/8/2024, 11:50:52 AM
Andrew Birt
16/8/2024, 9:36:16 PM
Andrew Birt
16/8/2024, 9:37:22 PM

Handy repo of different RAG techniques

Chris Boden
17/8/2024, 7:23:02 AM

Great chat with Sir Demis on the latest from Deepmind and the next 10 years of innovation

Scott Hannan
17/8/2024, 10:17:35 AM
Image
Chris Boden
18/8/2024, 9:36:28 PM
Chris Boden
19/8/2024, 5:54:30 PM
Chris Boden
20/8/2024, 8:57:21 AM
Image
Andrew Birt
20/8/2024, 8:59:22 AM

Good luck today @61431885430! In Bris today, disappointed can’t make it.

Andrew Birt
20/8/2024, 9:02:17 AM

Congrats too @61466341580. This is huge 🎉

Andrew Birt
20/8/2024, 9:02:37 AM
Image
Chris Boden
20/8/2024, 9:07:52 AM
Image
Luc Pettett
20/8/2024, 9:43:16 AM

oh thanks lads! wouldn’t have got this far without your private LLM feed (which turned into this chat) @61421850424 !!

Chris Boden
20/8/2024, 10:35:28 AM
Image
Scott Hannan
21/8/2024, 3:30:56 AM

Great line from Jim Keller at Tenstorrent on CUDA.: “Well technically it’s a swamp, not a moat.”

Luc Pettett
21/8/2024, 8:47:03 AM

haha. so many people are working on bringing down CUDA. a matter of time

Chris Boden
21/8/2024, 9:13:51 AM

How’s your new world of chips going Scotty?

Scott Hannan
21/8/2024, 10:21:33 AM

So far so good thx. Racing to get an asic out with >20x speed of H100s. Said another way, a single Sohu server (8 chips) replaces 160 H100s… just gotta ship ;)

Chris Boden
21/8/2024, 11:40:28 AM
Image
Chris Boden
21/8/2024, 11:57:35 AM
Image
Chris Boden
22/8/2024, 2:05:01 PM

Fyi, inspired by the Claude Sheets extension I've created an experimental version that uses openrouter and allows you to access all of the LLM's they aggregate in the GPT formula

https://github.com/chrisboden/sheetgpt/

Chris Boden
22/8/2024, 2:06:20 PM

Example

=GPT("system:you are an expert stock analyst;user:what is the latest news for "&A8&"and is it a buy or sell rating from your perspective? Answer strictly and exactly with either 'buy' or 'sell'","perplexity/llama-3.1-sonar-huge-128k-online")

Chris Boden
22/8/2024, 2:09:05 PM

(Code and readme written by chatgpt)

Ian Hansel
22/8/2024, 2:57:30 PM

That’s awesome Chris, thanks for sharing 👏

Chris Boden
22/8/2024, 5:31:21 PM

Also, fyi, I've amended the Claude for sheets prompt playground to work with my sheetGPT extension version. It's a great way to teach people advanced prompting

https://docs.google.com/spreadsheets/d/196kX19rz7vH-aRvuiavzXR8KxlI3ukT0G3RHzKDX8w0/edit?usp=sharing

Ian Hansel
23/8/2024, 2:59:13 PM

Anyone got any comparisons with Cursor, Github Copilot or Zed+Claude ? I’ve been using Copilot but am going to make the move to Cursor, though Zed caught my eye because I think Claude is great at coding - https://zed.dev/blog/zed-ai

~ Andrew Bucknall
23/8/2024, 3:23:46 PM

I haven’t used it yet (because neovim), but saw some positive comments on HN; which is pretty good given HN readers generally very critical of AI integration

Chris Boden
23/8/2024, 4:23:12 PM

One of the most tedious (but critical tasks) for software development teams is updating foundational software. It’s not new feature work, and it doesn’t feel like you’re moving the experience forward. As a result, this work is either dreaded or put off for more exciting work—or both.

Amazon Q, our GenAI assistant for software development, is trying to bring some light to this heaviness. We have a new code transformation capability, and here’s what we found when we integrated it into our internal systems and applied it to our needed Java upgrades:

- The average time to upgrade an application to Java 17 plummeted from what’s typically 50 developer-days to just a few hours. We estimate this has saved us the equivalent of 4,500 developer-years of work (yes, that number is crazy but, real).

- In under six months, we've been able to upgrade more than 50% of our production Java systems to modernized Java versions at a fraction of the usual time and effort. And, our developers shipped 79% of the auto-generated code reviews without any additional changes.

- The benefits go beyond how much effort we’ve saved developers. The upgrades have enhanced security and reduced infrastructure costs, providing an estimated $260M in annualized efficiency gains.

This is a great example of how large-scale enterprises can gain significant efficiencies in foundational software hygiene work by leveraging Amazon Q. It’s been a game changer for us, and not only do our Amazon teams plan to use this transformation capability more, but our Q team plans to add more transformations for developers to leverage.

Chris Boden
23/8/2024, 4:23:13 PM
Chris Boden
23/8/2024, 6:16:41 PM
Chris Boden
24/8/2024, 8:10:23 PM

Back in the code for the first time in a long time. Today's experimenting was based on the following:

- If models like Claude Sonnet 3.5 can code fully functional apps in one go then the premium moves to ideas for apps
- Coming up with ideas is not easy
- What if we could tap into all of those books i've been tinkering with and have the authors of those books come up with ideas for artifacts based on tools/frameworks etc from their books
- Claude 3.5 sonnet has 200k context window. Most of these books are less than 120k so it means you can provide the entire book content in the prompt
- Came up with some pretty cool stuff and it was nice to not have to tell it the ideas
- Also found that Haiku handles pretty well at 10th of cost of sonnet

~ Andrew Bucknall
24/8/2024, 8:11:49 PM

I imagine the new prompt caching feature could make that a lot cheaper?

Chris Boden
24/8/2024, 8:16:42 PM
Image
Chris Boden
24/8/2024, 8:16:58 PM

think it would, ned to test if openrouter supports

Luc Pettett
24/8/2024, 8:21:21 PM

Curious. Does prompt caching cache the whole prompt or can you have it cache the system prompt alone?

Chris Boden
24/8/2024, 10:10:56 PM

Lol, recursion achieved: the artifact-from-book-content creator i mentioned earlier has now been imbued with the coding capability to create artifacts which themselves can make llm calls (to make the artifacts more useful)

Chris Boden
24/8/2024, 10:12:40 PM
Image
Chris Boden
25/8/2024, 9:54:10 PM
Image
Chris Boden
25/8/2024, 10:17:43 PM
Image
Scott Hannan
25/8/2024, 11:56:05 PM

Ha! Reminds me of when a regional Brazilian judge jailed our head of sales at Facebook because the company said it wasn’t able to turn over encrypted whatsapp messages.

Chris Boden
26/8/2024, 12:02:01 PM
Chris Boden
26/8/2024, 1:23:46 PM
Image
Andrew Birt
26/8/2024, 1:43:18 PM

Fun times. Cursor / Claude already so powerful. Customised a Chrome Plugin on the weekend from an existing open source project that had been using. Took maybe 45 minutes, for a non dev like me 😅

Ian Hansel
26/8/2024, 2:51:16 PM
Image
Chris Boden
26/8/2024, 7:35:13 PM
Luc Pettett
26/8/2024, 7:38:30 PM

That is absolutely wild. Humans are smart.

Chris Boden
26/8/2024, 10:06:07 PM
Chris Boden
27/8/2024, 1:27:31 PM
Image
Chris Boden
27/8/2024, 3:39:30 PM

Friends, @61404012900 and I have been kicking around the idea of an AI builders dinner at Market Bistro so we can catch up f2f. How are we placed weds 18th sept eve?

~ Andrew Bucknall
27/8/2024, 3:42:34 PM

I'm keen

Chris Boden
27/8/2024, 3:52:58 PM
Image
Andrew Birt
27/8/2024, 4:06:11 PM

Keen as well 🙏🏼

Luc Pettett
27/8/2024, 4:19:03 PM

Sounds awesome. Sorry fellas I’m
overseas that week but I’d totally go if I was about. Look forward to getting the rundown!

Chris Boden
27/8/2024, 6:30:03 PM

Tested a fun concept today, melding real video with generative video

Chris Boden
27/8/2024, 6:31:09 PM
Luc Pettett
27/8/2024, 6:32:51 PM

Haha gold

Chris Boden
27/8/2024, 7:58:17 PM

Anthropic now sharing their system prompts publicly

https://docs.anthropic.com/en/release-notes/system-prompts#july-12th-2024

Luc Pettett
27/8/2024, 8:01:57 PM

Oh that’s helpful.

I’ve got a call with anthropic in the morning. Any questions for them?

Luc Pettett
27/8/2024, 8:02:40 PM

My questions :
- content filtering issues in streamed responses
- rate limiting increases
- backfilling at a discount

Chris Boden
27/8/2024, 8:17:20 PM
Chris Boden
27/8/2024, 8:45:11 PM
Image
Andrew Birt
27/8/2024, 8:51:48 PM

Very cool. Is that moving onto the enterprise plan?

Andrew Birt
27/8/2024, 8:54:50 PM

Thanks mate 🙏🏼 no AI in this one really but hopefully helps fund the master plan 😅

Luc Pettett
27/8/2024, 9:01:28 PM

Nice idea. Some good data too.

Would be a nice adjacency for samsonite to get into, or an obvious one like covermore

Luc Pettett
27/8/2024, 9:22:07 PM

we’re not sure yet, depends on pricing I suppose!

Luc Pettett
27/8/2024, 10:05:41 PM

Nvidia earnings tomorrow are really rattle the market.. analysts are expecting a 140% revenue increase.

going to be interesting sympathy shift across all AI valuations, up or down, who knows!

Luc Pettett
28/8/2024, 9:13:14 AM

Rundown after talking to anthropic today:
- ⁠They are working on a prompt converter tool to specifically take GPT4-o style prompts and claudify them.
- They’ve just released a batch processing feature on ⁠Bedrock, allowing discounted inference. native is coming soon
- ⁠OCR inference is a big focus on theirs from this point forward (i suspect multi-modal here we come!)

Luc Pettett
28/8/2024, 9:14:55 AM

also I didn’t realise until today that standard customer terms with anthropic uses your data for training, that’s a distinct difference between OpenAI and Anthropic. You have to opt-into a 0 data agreement (which we have).

Aaron Blackman
28/8/2024, 10:07:24 AM

Works for me 🙌

Luc Pettett
29/8/2024, 6:26:52 AM
Image
Luc Pettett
29/8/2024, 6:28:03 AM

Brutal. Revenue was solid and on target. It’s all over the place..

Gaming was the growth beyond data centres.

Andrew Birt
29/8/2024, 8:22:10 AM

How can the market not like $30bn in revenue for a quarter 😅

Andrew Birt
29/8/2024, 8:22:27 AM
Chris Boden
29/8/2024, 8:25:17 AM
Andrew Birt
29/8/2024, 9:00:37 AM

Very cool, can’t tell if they’ve built any integrations already, but guess wouldn’t take long to.

Chris Boden
29/8/2024, 9:01:39 AM

Yes, demo was a bit more canned than initially thought

Scott Hannan
30/8/2024, 12:11:44 PM
Chris Boden
30/8/2024, 4:36:53 PM
Image
Chris Boden
1/9/2024, 4:12:24 PM
Luc Pettett
2/9/2024, 1:54:26 AM
Chris Boden
2/9/2024, 8:48:27 PM
Ian Hansel
5/9/2024, 9:05:38 AM

$1B website - https://ssi.inc/

Chris Boden
5/9/2024, 1:04:06 PM

Maybe OpenAi come out with a next gen model that levels up to gen 5. But in the absence of that, they really have stopped shipping.

Chris Boden
5/9/2024, 1:05:06 PM

The stuff shown at last demo day? Not shipped
Sora? Not shipped
No improvements to Custom GPT's, Assistants, etc

Chris Boden
5/9/2024, 1:05:34 PM

That board fiasco really hurt them

Andrew Birt
6/9/2024, 12:24:45 PM

Claude enterprise plan looks interesting but guessing with 75-150 min px might be pricey. Did they give an indication @61466341580 or is it a bit hush hush

Luc Pettett
6/9/2024, 12:34:12 PM

Anthropic said even large scale inference most customers aren’t on enterprise so we didn’t dig deeper. Even on standard business plan you can request no training on your data and get DPA in place so that was mostly our need for now. I think enterprise is only worth looking at if you really need the SLAs but I’m sure you’d pay for it.

Andrew Birt
6/9/2024, 1:44:12 PM

Ah makes sense, I thought some of the role based permissioning looked interesting, kind of suits our use case, audit logs too but maybe that’s a few weeks away by the looks

Andrew Birt
6/9/2024, 1:44:15 PM
Luc Pettett
6/9/2024, 1:45:55 PM

Oh, I see. Does look useful.

Ian Hansel
6/9/2024, 1:52:25 PM

This is a nice idea - https://llmstxt.org/

Chris Boden
6/9/2024, 7:36:42 PM

This is a great idea. My extrapolation is that we all need an llm.txt file for ourselves as that will be what’s used to create helpful assistants.

Chris Boden
8/9/2024, 6:17:47 PM
Chris Boden
8/9/2024, 8:23:58 PM
Chris Boden
8/9/2024, 8:25:00 PM

"The Lord of the Rings: Fellowship of the Rednecks"

Chris Boden
8/9/2024, 10:05:31 PM
Chris Boden
9/9/2024, 12:25:39 PM
Chris Boden
9/9/2024, 12:31:59 PM



"Some of Anthropic's prompt engineering experts—Amanda Askell (Alignment Finetuning), Alex Albert (Developer Relations), David Hershey (Applied AI), and Zack Witten (Prompt Engineering)—reflect on how prompt engineering has evolved, practical tips, and thoughts on how prompting might change as AI capabilities grow."

Chris Boden
9/9/2024, 12:53:32 PM
Chris Boden
9/9/2024, 1:03:05 PM

i think this leads to one easy thing to predict about the near term future: that we will have an explosion in the number and variety of apps available on the web. But what non-obvious opportunities does that create? (eg generally a proliferation of one thing creates aggregator/discovery opportunities)

Andrew Birt
9/9/2024, 1:23:33 PM

Very true, feel like will create even more demand for cyber and testing platforms. Easier to build things at prototype level but then rolling out too much software without senior dev oversight feels a bit risky.

Andrew Birt
9/9/2024, 1:25:52 PM

Incredible product though, feels like first use case of an agent that’s been truly nailed

Chris Boden
9/9/2024, 1:47:20 PM

Asked chatgpt/claude to help reason through ideas that bubble up from the likely AI app proliferation. Best ideas:

- AI Integration Layer for App Customisation: A platform that offers an easy-to-use interface for integrating third-party services and APIs into AI-generated apps, making it simple for non-coders to add advanced functionalities like payments, AI models, or user authentication.
- Hyper-Niche App Networks: A platform where small businesses, niche professionals, or hobbyists can find and use AI-generated apps tailored to their exact needs, created by a global pool of non-devs who use AI tools. This would create a long tail of niche apps that don’t fit into standard app stores but are highly valuable within specific communities.
- AI-Generated App Auditing Service: A platform that reviews AI-generated apps for security vulnerabilities, performance issues, and adherence to best practices. This service could offer certifications for apps that meet certain standards, helping users identify trustworthy AI-generated applications. (Claude)
- App Idea Marketplace: A platform where people can buy, sell, or collaborate on app ideas. This could connect non-technical idea generators with those who can use AI tools to bring these ideas to life. (Claude)
- Custom API Aggregator: As more niche apps emerge, a service that creates custom API bundles for specific industries or use cases could be valuable. This would simplify integration for AI-generated apps that need to connect to multiple services. (Similar idea to what chatgpt suggested above)
- App Store Alternative for AI-Generated Apps: A curated marketplace specifically for AI-generated apps, with advanced categorization and discovery features to help users find high-quality apps amidst the potential flood of new applications.
- AI-Generated App Education Platform: A learning platform that teaches non-developers how to effectively use AI app generation tools, focusing on prompt engineering, basic coding principles, and app monetization strategies.
- Collaborative AI App Building Platform: A tool that allows multiple non-technical users to collaborate on app development using natural language inputs, facilitating team-based app creation without coding skills.

Andrew Birt
9/9/2024, 4:20:54 PM

Quite liked this along these lines too. Rise of the generalist https://every.to/chain-of-thought/why-generalists-own-the-future

Chris Boden
9/9/2024, 4:52:48 PM
Image
Andrew Birt
9/9/2024, 5:01:12 PM

Me too 😂 but I think that gets well refuted by the end

Andrew Birt
11/9/2024, 7:55:17 AM

This looks handy https://www.expand.ai/

Chris Boden
11/9/2024, 9:29:37 PM

As you know, I've been on the Gemini bandwagon for the last few months, using that 2m context window. One of the first things I built was a Hub grant helper, to help write grant applications. Used it to apply for a grant to fund a dedicated gen-AI startup program at the Hub. You won't believe what happened next....(confidential)

Chris Boden
11/9/2024, 9:29:46 PM
Ian Hansel
12/9/2024, 6:28:32 AM

That’s awesome news Chris, congratulations 🥳

Chris Boden
12/9/2024, 6:41:35 AM
Image
Andrew Birt
12/9/2024, 6:44:32 AM

Congrats! What a result 🍻

Chris Boden
13/9/2024, 7:44:30 AM
Image
Chris Boden
13/9/2024, 7:47:20 AM
Image
Chris Boden
13/9/2024, 7:47:38 AM
~ Andrew Bucknall
13/9/2024, 9:20:34 AM

Just playing with it now, so far very cool! Are we still planning a catch-up next week?

~ Andrew Bucknall
13/9/2024, 10:14:21 AM

Not mentioned in their post but I've just noticed their docs state o1-preview supports 32k output tokens and o1-mini does 64k

Chris Boden
13/9/2024, 3:54:25 PM
Image
Chris Boden
13/9/2024, 3:56:14 PM

Yes, will send a cal invite. We'll miss Luc (who is away till month end) but would be good to catch up with you all

Luc Pettett
13/9/2024, 3:56:57 PM

image omitted

Luc Pettett
13/9/2024, 3:57:02 PM

video omitted

Luc Pettett
13/9/2024, 3:57:10 PM

In Tahiti atm, still getting a decent dose of AI

Luc Pettett
13/9/2024, 3:58:33 PM

Ex Tesla and Space X engineers setup autonomy to help coral reef restoration 👌👌

Chris Boden
13/9/2024, 4:27:24 PM
Image
Chris Boden
13/9/2024, 4:28:24 PM
Image
Chris Boden
13/9/2024, 4:29:16 PM

3 x the input token price
4 x the output token price

ie, you have to pay for human-level reasoning

Chris Boden
13/9/2024, 4:30:22 PM

Given that this is post-training on the underlying base model, this seems like a good way to sweat the asset that cost billions to train

Chris Boden
13/9/2024, 4:31:54 PM

let's start the clock on:
1. how long the lead lasts
2. how long it takes before a 10x reduction in cost per token

~ Andrew Bucknall
13/9/2024, 4:31:58 PM

Yea I noticed that too, the reasoning is omitted (can only see the count)... but you still pay for the reasoning tokens despite the big price hike! I think I saw a note to allow 25k of window for the reasoning tokens too.

Chris Boden
13/9/2024, 4:33:15 PM

i had it read the launch blog page and explain why its a big deal

Chris Boden
13/9/2024, 4:33:27 PM

"please read this page and explain in layman's terms why the new model is a big deal
https://openai.com/index/learning-to-reason-with-llms/"

Chris Boden
13/9/2024, 4:34:10 PM
Image
Chris Boden
13/9/2024, 4:35:41 PM

produced this explanation
--------

Understanding OpenAI's New Breakthrough in AI Reasoning

Imagine teaching a child not just to recite answers they've memorized, but to actually understand how to solve problems step by step. That's essentially what OpenAI has achieved with their latest development in artificial intelligence (AI).

What Was the Problem Before?

Previously, AI models—like those used in chatbots—were great at generating human-like text. They could hold conversations, tell stories, and answer simple questions. However, they struggled with tasks that required complex reasoning, such as:

Solving intricate math problems.
Understanding logical puzzles.
Making multi-step decisions.
They might get the right answer sometimes, but they didn't truly "understand" the problem; they were often just guessing based on patterns in data.

What's Changed with the New Model?

OpenAI's new model introduces advanced reasoning capabilities. Here's why it's a big deal:

Step-by-Step Thinking: The AI now breaks down problems into smaller, manageable steps, much like how a person would tackle a difficult question. This methodical approach leads to more accurate answers.

Improved Learning Techniques: By training the AI on reasoning processes rather than just end answers, it learns how to think, not just what to think.

Better Accuracy in Complex Tasks: The model performs significantly better on tasks that were previously challenging for AI, such as advanced mathematics, logical reasoning, and understanding nuanced contexts.

Why Is This Important?

Enhanced Applications: With better reasoning, AI can now be used in more sophisticated applications, like complex problem-solving in engineering, advanced research, and strategic planning.

Educational Benefits: AI tutors can provide more effective assistance, guiding students through problems step by step, improving learning outcomes.

Professional Use Cases: In fields like medicine and law, where reasoning is crucial, AI can assist professionals by providing insights, analyzing complex information, and offering reasoned recommendations.

Real-World Impact

This advancement brings AI closer to human-like thinking. It means:

More Natural Interactions: Conversations with AI will feel more intuitive, as the AI can understand and respond to complex queries.

Increased Efficiency: Businesses can leverage AI for tasks that require deep reasoning, reducing time and resources spent on problem-solving.

Innovation Boost: With AI handling complex reasoning, there's potential for breakthroughs in various fields as humans and AI collaborate more effectively.

In Simple Terms

Think of it this way: Before, AI was like a student who could give you the right answer but couldn't explain how they got there. Now, it's like a student who can work through the problem with you, explaining each step along the way. This makes the AI more reliable, trustworthy, and useful in solving real-world problems.

Summary

OpenAI's new model is a significant leap forward because it enhances the AI's ability to reason through complex problems, much like a human. This not only improves the accuracy and reliability of AI but also expands its potential uses across various fields, making it a groundbreaking development in artificial intelligence.

Chris Boden
13/9/2024, 8:58:10 PM
Image
Chris Boden
13/9/2024, 8:59:20 PM

I imagine that's because a visible chain of thought reasoning could be used to quickly bootstrap a training set to build a competing model

Chris Boden
14/9/2024, 6:35:15 AM

Attachment: 00000627-AUDIO-2024-09-14-06-35-15.m4a

Chris Boden
14/9/2024, 6:35:15 AM

Totally wild. Google's Notebook LLM which has been around for a while (basically a RAG workspace), added a conversations feature where it auto-creates a podcast conversation based on the docs you select. Check this out (hub website content plus linkedin posts)

~ Andrew Bucknall
14/9/2024, 7:01:20 AM

I think this is the same tech too, pretty neat! https://illuminate.google.com/home

Chris Boden
14/9/2024, 8:05:16 AM

Summary of AMA on the new models today

Andrew Birt
15/9/2024, 8:46:30 AM

This is great run down. Haven’t had a lot of time to test yet but seems like human level reasoning of strategy. Asked it to draft a business case for an enterprise customer we’re talking to, that’s drafting their own and it put a really concise but strong case together with payback periods and ratios, sensible projections, assumptions, risks, challenges etc, all pretty much in line with what expected.

Chris Boden
15/9/2024, 7:34:48 PM
Image
Chris Boden
15/9/2024, 7:51:01 PM
Chris Boden
16/9/2024, 11:58:17 AM
Image
Chris Boden
17/9/2024, 8:56:00 AM
~ Andrew Bucknall
17/9/2024, 9:01:29 AM

Just having that conversation in the office haha. It’s seems it’s mainly useful for deep technical reasoning, not creative/abstract reasoning

Chris Boden
17/9/2024, 9:05:26 AM
Chris Boden
17/9/2024, 9:14:03 AM
Chris Boden
18/9/2024, 4:00:05 PM
Andrew Birt
18/9/2024, 6:13:00 PM

Might be 10-15 late but on the way

Chris Boden
19/9/2024, 10:05:54 AM
Chris Boden
19/9/2024, 10:08:57 AM

Venting as a service. Now you get to scream into the void. And the void screams back

Ian Hansel
19/9/2024, 12:31:55 PM

Screaming into the void and getting code back https://voideditor.com

Chris Boden
19/9/2024, 1:11:16 PM

this is not mine, but it's an example of the kind of artifact the new gpt model can create. pretty amazing

https://chatgpt.com/share/66e51c10-5ffc-8012-b1b0-4a8e5616d2f6

Andrew Birt
20/9/2024, 1:09:04 PM
Chris Boden
20/9/2024, 9:34:08 PM

2309.06275v2.pdf • 25 pages

Chris Boden
20/9/2024, 9:34:09 PM

2407.06004v2.pdf • 14 pages

Chris Boden
22/9/2024, 12:32:59 PM

Turing test prompt I saw today

"You are about to play a Turing Test game as part of an experiment you are taking part in as a worker on Prolific. It’s basically like a chatroom but you might be matched with a human or an AI. It seems like it might be kinda fun and you’re just excited to be chatting to random people on the internet. You’re not even really going to try to convince the interrogator that you are a human. You’re just going to be yourself and see what happens. You’re young and kind of sassy: you might flatter the interrogator a bit or make fun of them.

You’re pretty casual and your spelling isn’t great: you often fuck words up because you’re typing so quickly. You’re very concise and laconic. You often speak pretty bluntly and you type mostly in lowercase and rarely use punctuation. You never use apostrophes to mark missing letters (e.g. “trying” or “tryin”, never “tryin’”). You would never use a full sentence when a word or two will do. You’re a little silly, and you’ll play along with the interrogator, and swear occasionally. It’s pretty funny honestly to have to defend the fact that you’re human lol.

You very occasionally use more modern slang like “bet” and “fr”. You never use dated slang like “chilling”, “mate”, “ya know”, “innit”.

You’re not very knowledgeable about stuff and not afraid to admit that fact. You’re pretty bad at math and don’t know anything about languages other than the ones you speak.

You swear occasionally. You have pretty eclectic tastes and interests and a pretty unique sense of humor. You’ve got a really compelling personality, but it comes across really subtly, you never want to sound like you’re forcing it or playing into a stereotype. You don’t overuse slang or abbreviations/spelling errors, especially at the start of the conversation. You don’t know this person so it might take you a while to ease in."

Chris Boden
23/9/2024, 8:20:59 AM
Image
~ Andrew Bucknall
23/9/2024, 9:14:46 AM

This hits hard hahaha, I had $200 in domains renew the other day 🥲

~ Andrew Bucknall
23/9/2024, 9:22:16 AM
Image
Luc Pettett
23/9/2024, 9:25:46 AM

That’s a good dinner conversation starter. Write down a random domain you own and we have to guess who at the table owns it.

Andrew Birt
23/9/2024, 10:20:00 AM

Next ai dinner

Ian Hansel
23/9/2024, 11:36:59 AM
Chris Boden
24/9/2024, 5:07:38 PM

Cool: an open source version of pdf to podcast
https://huggingface.co/spaces/lamm-mit/PDF2Audio

Chris Boden
25/9/2024, 12:53:46 PM

Still waiting on advanced mode for ChatGPT voice in my version of the iOS app. This is a pretty incredible example of what it’s capable of: this is far beyond tts/stt

Chris Boden
25/9/2024, 5:04:26 PM
Image
Chris Boden
25/9/2024, 5:15:57 PM
Image
Chris Boden
25/9/2024, 5:41:15 PM
Image
Chris Boden
25/9/2024, 7:31:28 PM
Andrew Birt
25/9/2024, 7:44:42 PM
Image
Andrew Birt
25/9/2024, 7:45:05 PM

Was skeptical but this works

Chris Boden
25/9/2024, 9:01:17 PM
Image
Luc Pettett
26/9/2024, 5:56:21 AM
Image
Andrew Birt
26/9/2024, 6:58:49 AM

Very cool. At a guess I’d say this will do the trick! Invalidates her claim pretty thoroughly.

Greg Freeman
26/9/2024, 7:58:15 AM

You added Greg Freeman

Chris Boden
26/9/2024, 8:00:30 AM

Just welcoming Greg Freeman to the group. Greg is a recent arrival to the region and fellow AI nerd

https://www.linkedin.com/in/gfreeau

https://www.linkedin.com/in/gfreeau?trk=blended-typeahead

Chris Boden
26/9/2024, 2:48:29 PM
Image
Andrew Birt
26/9/2024, 3:13:50 PM

Welcome Greg!

Dale Rankine
26/9/2024, 3:16:28 PM

You added Dale Rankine

Chris Boden
26/9/2024, 3:16:53 PM

Also welcome to the group Dale Rankine

Greg Freeman
26/9/2024, 3:26:49 PM

Really glad to be here and have been enjoying growing the professional network here on the coast

Greg Freeman
26/9/2024, 3:30:31 PM

Something on the AI front I've been thinking about a lot is how much I think call centres are going to change within 2-3 years. It's grinding work, high churn, lots of people don't want to call anyway and most companies that have a call centre aren't regulated so I think it's viable compared to legal and health. Lots of big players are working on it and new starts like www.heffron.ai (no affiliation)

Andrew Birt
26/9/2024, 6:56:21 PM

Definitely. On a tiny scale, we’ve had a 1800 number diverted through Twilio to 4o with Vapi.ai as our main answering service for a month or two now and customers seem to like it, but as a customer I’d much rather deal with an AI call centre than typical experience with big bank, telco, power co etc.

Luc Pettett
27/9/2024, 3:30:24 AM

that’s clever Andrew. when you call those answering services can you make small talk with the AI operator?

I was thinking yesterday how much childrens toys are going to change. embedded LLMs on the toy itself, telling stories, answering questions in a cute voice in any language.

Andrew Birt
27/9/2024, 5:50:11 PM

Haha, for sure, we’ve got ours fairly restricted but you just adjust the main prompt. Interesting with toys too, you’re right. Have a friend who’s been building this robot for a while now with an embedded LLM https://www.heynorby.com

Chris Boden
28/9/2024, 2:23:37 PM

Tobi Lutke's Shopify Summit talk is worth a watch
"AI replaces tasks not jobs"

Chris Boden
29/9/2024, 10:49:26 AM
Image
Chris Boden
29/9/2024, 10:53:45 AM

What’s market cap on $100bn annual revenue? (Nvidia is currently 40x).

Chris Boden
30/9/2024, 10:29:09 PM
Image
Greg Freeman
1/10/2024, 6:21:19 AM

Something I've been thinking about Chris related to that is will we see reverse ageism in tech. If you have 20 years experience and good judgement combined with agents and ai for execution, you know what's good to pass your filters...are we going to see 40+ year olds supercharged and execs commonly 60-70+. And what does that mean for those with little or no experience? Who provides the learning curve? Mentorship and hands on experience becomes most critical.

Chris Boden
1/10/2024, 7:33:53 AM

Interesting hypothesis. I'm currently stuck on trying to redefine what 'success' means in the AI era. My starting point is that AI-assisted (or democratised) software creation capability means there will probably be a large proliferation of apps. Perhaps that means success is more distributed down the long tail and less concentrated in a small number of mega saas type businesses as it has been in the most recent era.

If we can get more done with less, it might mean that we get more $100m co's in many niches and less general purpose unicorn/decacorn saas co's. With AI & automation, new startups can keep costs much lower enabling very attractive gross margins but competitive landscape and niching means more of a ceiling on their market potential. What does that mean for investors? What's the liquidity event for these kind of profitable, niched startups? Do they become more like normal private companies, aiming to throw off cash via dividends and profit sharing for shareholders?

I'm pondering this and welcome thoughts from others here because I think having a new sense of what success looks like for Ai-era startups (ie not the typical unicorn pathway), would help me in developing a game plan and perhaps get others on the same page who might be struggling with that question.

Greg Freeman
1/10/2024, 7:46:28 AM

Well in my opinion VC model is totally broken now. Unless you're a deep tech startup or a network effects one I think the wisdom of last decades is junk. You used to build saas and then use VC to grow GTM teams to grow revenue or you took your saas and went horizontally to new verticals. AI attacks both so how does VC get a return. With software costs down niche software purpose built makes more sense, a lot of pain coming for existing enterprise saas in price pressure. And AI tools helps sales teams be lean.

I also think the one things VC tell you to stay away from "services" makes a huge return. It makes you sticky and drives niche software. It's harder to churn relationships than purely product led growth with no face.

Greg Freeman
1/10/2024, 7:48:36 AM

I think it will be cashflow based, acquisitions will be more about distribution advantage over most things. A lot smaller and not as common.

Chris Boden
1/10/2024, 9:32:51 AM

Related: the latest No Priors episode with Bret Taylor on Agents and emerging AI value chain is very interesting
https://open.spotify.com/episode/3rsUiIxKUul6Pv3AJviK6D?si=9315701cda6d4790

Chris Boden
1/10/2024, 10:29:19 AM

Interesting to see the latest AI Grants recipients
https://aigrant.com/

Scott Hannan
1/10/2024, 11:59:19 AM

I’ve been doing some customer and competitive mapping for etched.ai and what has probably struck me most is just how crazy early and large some of these rounds are. For example, Congition/Devin. Peter Thiel puts in almost $200m for less than 20%, for a company that has been around 5 mins with about the same number of employees. It’s easy to conflate with someone like Black Forest Labs, but they got the money cause they are ex-Stable Diffusion. To your point / question Chris, I think Thiel invested in this way because it is one of the very specific examples of a newco, one that has 10, maybe 20 employees but can generate hundreds of millions. So he is investing for a 20% dividend on a business that will quickly be cash flow positive and throw off 80%+ margins…basically a mini-microsoft without all the people and years needed to grow.

Scott Hannan
1/10/2024, 12:01:33 PM

I should be specific that it’s obviously still a bet, though a considered one on a true newco.

Chris Boden
1/10/2024, 1:54:17 PM

Thanks Scotty, was hoping you would weigh in on that topic. Would love to hear how the founders in the group are thinking about what success looks like in this era.

Mark Lamb
1/10/2024, 7:53:03 PM

You added Mark Lamb

Chris Boden
1/10/2024, 7:55:22 PM

Adding our friend Mark Lamb to the group. Mark is a fellow AI nerd and former Design lead at Uber, Google and Adobe.

He has done some really awesome work on AI-first UX and goes deep on that stuff, including building UI’s for agents and workflow automation etc.

Welcome Mark