This is very cool! I’ve wanted something like CodeMic for a long time.
Back when I was at Twitter, we used Review Board for code reviews (this was in 2009, before GH was a thing for most companies). It was tough to thoughtfully review large branches, especially for parts of the codebase that I wasn’t familiar with. I remember thinking that if I could somehow record the development process for a PR I was reviewing, it would be easier to understand what the submitter was trying to accomplish and how they went about doing so. As it was, I found myself reviewing code style more than functionality, architecture, or design.
I watched most of the intro video, but didn’t go deeper on the site. Does CM integrate easily into the code review/PR process? I suppose I could just attach a link in any PR description?
Thanks a lot! I had thought about it being useful for learning, for fun, as a new kind of documentation, or even for onboarding new hires, but the use case of code review hadn't occurred to me. That's great. I can think of three ways to share sessions:
- attach a link as you said
- once the web player is ready, it could perhaps be integrated directly into the code review tool, like embedding a YouTube video (sketched after this list)
- the entire recorded session can be exported as a zip file and attached to the PR or shared privately
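For the embed option, I imagine something as simple as this. Purely a sketch, and the player URL scheme is hypothetical since the web player doesn't exist yet:

```typescript
// Hypothetical embed of a CodeMic session in a code-review page,
// by analogy with a YouTube iframe embed. The URL is a placeholder.
function embedSession(container: HTMLElement, sessionId: string): void {
  const iframe = document.createElement("iframe");
  iframe.src = `https://codemic.example/embed/${sessionId}`; // made-up scheme
  iframe.width = "800";
  iframe.height = "450";
  iframe.allow = "autoplay"; // let the session start playing on click
  container.appendChild(iframe);
}
```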
This is super cool. One neat idea: when I'm in offline mode, I can clone my voice, provide some context data/sources, and have my AI clone answer calls for me. It can give me a summary of conversations it had each day and allow me to follow up.
We’re definitely heading in that direction and currently experimenting with LiveKit’s Agent framework. I’m guessing you’re Russ from LiveKit? If so, I’m a huge fan of what you’re doing! Would love to connect and explore ideas further: [email protected]
In most real applications, the agent has additional logic (function calling, RAG, etc.) beyond simply relaying a stream to the model server. In those cases, you want it to be a separate service/component that can be scaled independently.
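To sketch what I mean (all names here are illustrative, not any particular framework's API), the agent sits between the client and the model server and does real work of its own rather than just forwarding bytes:

```typescript
// Hedged sketch of an agent as its own service, not a dumb relay.
// MODEL_WS_URL, lookupDocs, and the message shapes are all invented
// for illustration.
import WebSocket from "ws";

const MODEL_WS_URL = "wss://model.example/realtime"; // placeholder

// Pretend RAG step: fetch context relevant to the user's query.
async function lookupDocs(query: string): Promise<string> {
  // ...vector search, keyword search, etc.
  return `Top snippets for "${query}"`;
}

export function runAgent(client: WebSocket): void {
  const model = new WebSocket(MODEL_WS_URL);

  // Client -> agent: enrich the request before forwarding to the model.
  client.on("message", async (raw) => {
    const msg = JSON.parse(raw.toString());
    if (msg.type === "user_text") {
      const context = await lookupDocs(msg.text); // the RAG hop
      model.send(JSON.stringify({ type: "prompt", text: msg.text, context }));
    }
  });

  // Model -> agent: handle tool calls locally instead of relaying them.
  model.on("message", (raw) => {
    const msg = JSON.parse(raw.toString());
    if (msg.type === "tool_call") {
      // Run the function, then feed the result back to the model.
      model.send(JSON.stringify({ type: "tool_result", id: msg.id, result: "..." }));
    } else {
      client.send(raw.toString()); // plain text/audio deltas pass through
    }
  });
}
```

Because the RAG/tool work has its own latency and CPU profile, it makes sense to scale it separately from whatever is moving media around.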
Essentially, I think the LiveKit value proposition is an SFU that works, with signalling, and SDKs that already exist. My experience is that people radically overstate how hard signalling is and underestimate SFU complexity, especially with fast failover.
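To make "signalling is easy" concrete: a signalling server is just a relay for opaque SDP/ICE blobs between peers. A toy version using the Node `ws` package (rooms and auth deliberately elided) fits in a screenful:

```typescript
// Minimal WebRTC signalling sketch: relay SDP offers/answers and ICE
// candidates between the peers in a "room". The hard parts of WebRTC
// live elsewhere (the media path, the SFU), not here.
import { WebSocketServer, WebSocket } from "ws";

const rooms = new Map<string, Set<WebSocket>>();
const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (ws, req) => {
  const room =
    new URL(req.url ?? "/", "http://x").searchParams.get("room") ?? "lobby";
  const peers = rooms.get(room) ?? new Set<WebSocket>();
  peers.add(ws);
  rooms.set(room, peers);

  // Forward every signalling message to the other peers in the room.
  ws.on("message", (msg) => {
    for (const peer of peers) {
      if (peer !== ws && peer.readyState === WebSocket.OPEN) {
        peer.send(msg.toString());
      }
    }
  });

  ws.on("close", () => peers.delete(ws));
});
```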
In terms of being a higher-level API, arguably it is doomed to failure, thanks to the madness of the domain. (The part that sticks in my mind is audio device switching on Android.) WebRTC products seem to always end up with the consumer needing to know far more of the internals than is healthy. As such, I think once you are sufficiently good at using LiveKit, you are less likely to pick it for your next product, because by then you will be able to roll your own far more easily. That is, unless the value you were getting from it actually was the SFU infrastructure and not the SDKs.
The OpenAI case is so point-to-point that doing WebRTC for that is, honestly, really not hard at all.
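For example, here's roughly the entire browser side. Standard WebRTC APIs throughout; the OpenAI endpoint, model name, and ephemeral-key flow are from memory of their docs and may have drifted:

```typescript
// Point-to-point WebRTC with the OpenAI Realtime API: one peer
// connection, one offer/answer exchange done over a single HTTPS POST.
async function connect(ephemeralKey: string): Promise<RTCPeerConnection> {
  const pc = new RTCPeerConnection();

  // Send the mic upstream; play whatever audio track comes back.
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
  pc.addTrack(mic.getTracks()[0]);
  pc.ontrack = (e) => {
    const audio = new Audio();
    audio.srcObject = e.streams[0];
    void audio.play();
  };

  // Classic offer/answer, except the "signalling server" is one POST.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  const resp = await fetch(
    "https://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${ephemeralKey}`,
        "Content-Type": "application/sdp",
      },
      body: offer.sdp,
    },
  );
  await pc.setRemoteDescription({ type: "answer", sdp: await resp.text() });
  return pc;
}
```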
You really don’t need to know about WebRTC at all when you use LiveKit. That’s largely thanks to the SDKs abstracting away all the complexity. Having good SDKs that work across every platform with consistent APIs is more valuable than the SFU imo. There are other options for SFUs and folks like Signal have rolled their own. Try to get WebRTC running on Apple Vision Pro or tvOS and let me know if that’s no big deal.
> Try to get WebRTC running on Apple Vision Pro or tvOS and let me know if that’s no big deal.
[EDIT: I probably shouldn't mention that]. I have some experience getting WebRTC up on new platforms, and it's not as bad as all that. libwebrtc is a remarkably solid library, especially given the domain it's in.
I obviously do not share your opinion of the SDKs.
Heh, actually I'm pretty sure I've come across your X profile before. :) You're definitely in a small minority of folks with a deep(er) understanding of WebRTC.
About 80% of the time I experience choppy audio on my iPhone 15 Pro Max (iOS 18.1 beta) in Voice Mode (both Standard and Advanced). My internet connection is FTTH with a state-of-the-art Wi-Fi 7 router.
I wonder if this is because of bugs or the crazy load LiveKit may be under given the popularity of ChatGPT's voice modes right now.
It's using the same model/engine. I don't have knowledge of the internals, but there is a different subsystem/set of dedicated resources for API traffic versus first-party apps.
One thing to note: there is no separate TTS phase here; the audio is generated internally within GPT-4o, in both the Realtime API and Advanced Voice.
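You can see this in how the output arrives: over the WebSocket transport you get audio deltas straight from the model, with no TTS request anywhere in the flow. A sketch (event and field names are from memory of the docs, so treat them as approximate):

```typescript
// Consuming Realtime API output over WebSocket. There is no TTS call
// anywhere; the audio deltas come straight out of GPT-4o.
import WebSocket from "ws";

const url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview";
const ws = new WebSocket(url, {
  headers: {
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "OpenAI-Beta": "realtime=v1",
  },
});

const pcmChunks: Buffer[] = [];

ws.on("open", () => {
  // Ask for a spoken (and textual) response in one shot.
  ws.send(
    JSON.stringify({
      type: "response.create",
      response: { modalities: ["audio", "text"] },
    }),
  );
});

ws.on("message", (raw) => {
  const event = JSON.parse(raw.toString());
  if (event.type === "response.audio.delta") {
    // Base64-encoded PCM16 audio, generated by the model itself.
    pcmChunks.push(Buffer.from(event.delta, "base64"));
  } else if (event.type === "response.done") {
    console.log(`Got ${pcmChunks.length} audio chunks, no TTS step involved.`);
    ws.close();
  }
});
```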