Far better reasoning and general performance than 13B (if Llama-1 was any indication), and, like the other user said, it can fit on a single 24GB-VRAM gaming card and can be PEFT fine-tuned with 2x 24GB cards.
Llama-1-33B was trained on 40% more tokens than Llama-1-13B; this explained some of the disparity. This time around they both have the same data scale (2T pretraining + 500B code finetune), but 34B also uses GQA, which is slightly noisier than MHA. Furthermore, there were some weird indications in the original Llama-2 paper that the 34B base model is something… even more special: it was trained on a separate internal cluster with undervolted/underclocked GPUs (though this in itself can't hurt training results), its scores are below expectations, and it was less "aligned". Here, Code-Llama-Instruct-13B is superior to 34B on HumanEval@1. So yes, it's desirable, but I wouldn't get my hopes up.
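For context on the GQA point: grouped-query attention shares each key/value head across a group of query heads, which shrinks the KV cache at a small cost in attention capacity (the "noise" mentioned above). A minimal numpy sketch, where the shapes and head counts are illustrative and not Llama-2's actual configuration:

```python
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Single-batch GQA: n_q_heads query heads share n_kv_heads K/V heads.

    MHA is the special case n_kv_heads == n_q_heads.
    """
    seq, d_model = x.shape
    head_dim = d_model // n_q_heads
    group = n_q_heads // n_kv_heads  # query heads per shared K/V head

    q = (x @ wq).reshape(seq, n_q_heads, head_dim)
    k = (x @ wk).reshape(seq, n_kv_heads, head_dim)
    v = (x @ wv).reshape(seq, n_kv_heads, head_dim)

    # Repeat each K/V head for its group of query heads; only the
    # small (seq, n_kv_heads, head_dim) tensors need to be cached.
    k = np.repeat(k, group, axis=1)
    v = np.repeat(v, group, axis=1)

    scores = np.einsum("qhd,khd->hqk", q, k) / np.sqrt(head_dim)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out = np.einsum("hqk,khd->qhd", weights, v)
    return out.reshape(seq, d_model)

# Toy usage: 8 query heads sharing 2 K/V heads.
rng = np.random.default_rng(0)
d, seq, hq, hkv = 64, 8, 8, 2
hd = d // hq
x = rng.standard_normal((seq, d))
wq = rng.standard_normal((d, d))
wk = rng.standard_normal((d, hkv * hd))
wv = rng.standard_normal((d, hkv * hd))
out = grouped_query_attention(x, wq, wk, wv, hq, hkv)
```

Note the K/V projections output only `n_kv_heads * head_dim` features, which is where the memory/bandwidth savings come from.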
I wish that Meta would release models like SeamlessM4T[0] under the same license as Llama-2, or an even better one. I don't understand the rationale for keeping it under a completely non-commercial license, but I agree that it's better than not releasing anything at all.
There seem to be opportunities for people to use technology like SeamlessM4T to improve lives, if it were licensed correctly, and I don't see how any commercial offering from smaller companies would compete with anything that Meta does. Last I checked, Meta has never offered any kind of translation or transcription API that third parties can use.
Whisper is licensed more permissively and does a great job with speech to text in some languages, and it can translate to English only. However, it can't translate between a large number of languages, and it doesn't have any kind of text to speech or speech to speech capabilities. SeamlessM4T seems like it would be an all-around upgrade.
Yeah - different projects have different goals, and licenses aren't one-size-fits-all. Depending on the project, the type of technology, the goals, etc., we will select or even develop the right license that aligns with those goals. Hope this helps :)
Facebook Connect is what used to be called Oculus Connect. Kinda their equivalent of Apple's WWDC, I guess. It's when and where the Quest 3 will be officially unveiled in full, for example.
PyTorch Mobile is a start and is available for iOS and Android. Given that folks like PFN and Microsoft are (or will be) heavy contributors, I would expect support to broaden to more devices. Have you tried it out yet? No need for a separate set of op semantics or a separate framework. :) https://pytorch.org/mobile/home/
Anything that can't use the mobile GPU (or DSP/TPU for quantized inference) is pretty useless IMO, because it's just not energy-efficient enough to be practical in a battery-powered device, even if it's fast enough.
Once PyTorch is updated to use XNNPACK (being worked on right now), I think it should be fine to use. That plus QNNPACK makes inference quite low on power usage in my (admittedly limited; I just integrated XNNPACK) experience.
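For the curious, the power and bandwidth win from QNNPACK-style quantized inference comes from doing the matmul in int8 with int32 accumulation and rescaling to float once at the end. A toy numpy sketch of symmetric per-tensor quantization (not QNNPACK's actual kernels, and real implementations typically use per-channel scales and zero points):

```python
import numpy as np

def quantize(x):
    """Symmetric int8 quantization: x is approximated by q * scale."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((16, 16)).astype(np.float32)  # layer weights
x = rng.standard_normal((4, 16)).astype(np.float32)   # activations

qw, sw = quantize(w)
qx, sx = quantize(x)

# Accumulate in int32 (as int8 kernels do), then rescale to float once.
y_int8 = (qx.astype(np.int32) @ qw.astype(np.int32).T) * (sx * sw)
y_fp32 = x @ w.T

# Quantization introduces a small relative error vs. the fp32 result.
rel_err = np.abs(y_int8 - y_fp32).max() / np.abs(y_fp32).max()
```

The weights move through memory at a quarter of the fp32 size, which matters as much for power as the cheaper arithmetic does.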
As a rule, a CPU burns at least 5x the energy per FLOP compared to a mobile GPU or DSP. So no, CPU is not a viable option on mobile if you need to do inference constantly. For "every now and then" cases, sure.
Yeah, generally a cloud-based approach to training models is better than training on your Mac. AMD folks have been working on ROCm for some time and it works reasonably well for common models, but AMD GPU hardware isn't as pervasive as Nvidia GPUs in places like AWS.
If you are looking at using Colab for prototyping, you can also try TPUs, which are now supported in PyTorch. Here is a link with some additional info, including some Colab notebooks: https://github.com/pytorch/xla
But don't worry, this community moves fast!