
Guessing as to what the GP meant--Coral TPUs max out around 8M parameters, IIRC. That's a few orders of magnitude less than the smallest LLMs.



+news/rms.md

Oh boy.



Indeed. Curiously enough, I couldn't find the file listed in the commit.


Well this strikes close to home. During grad school, I'd drink a Red Bull (or two) a day. Now, I've quit caffeine completely (save a tiny cup of coffee in the morning occasionally).

I hope the author has a quick and easy recovery.


Playing them against each other would be a pretty cool way for the two companies' AI teams to compete though.


5d KGS is a world away from professional play; Google would win every game.


That would be equivalent to LeBron James going head to head with a disabled 5-year-old.


No, it would be equivalent to LeBron James going head to head with a talented amateur basketball player. Of course he would still win every time, but it would at least be a game of basketball.


Facebook is heavily invested in Torch, a machine learning framework built in Lua.


A Neural Turing Machine can learn programs.


This is huge. If they really do offer such a perf/watt advantage, they're serious trouble for NVIDIA. Google is one of only a handful of companies with the upfront cash to make a move like this.

I hope we can at least see some white papers soon about the architecture--I wonder how programmable it is.


There's no way Google lets this leave their datacenters. Chip fabrication is a race to the bottom at this point. [1]

Google is doubling down on hosting as a source of future revenue, and they're doing that by building an ecosystem around Tensorflow.

What I think is interesting is how weak Apple looks. Amazon has the talent and money to be able to compete with Google on this playing field. Microsoft is late, but they can, too.

Where's Apple? In the corner dreaming about mythical self-driving luxury cars?

[1]: http://spectrum.ieee.org/semiconductors/design/the-death-of-...


Apple designs their own CPUs. I think they'd be able to field a massively parallel FMAC chip if they thought that was a good idea.

Where Apple really looks weak is in datacenters, networking, and cloud services.


What does the iPhone of 2021 look like?

I get the feeling from today's announcements that Google sees the 2021 version of Google Now as the selling point for their 2021 Nexus line.

I don't think Apple is preparing to compete on that.


Apple's strength is in consumer (and to a lesser extent, developer) ecosystems; the cozy comfortable bubble you get when you're surrounded by everything Apple. Getting access to your stuff across multiple devices is virtually effortless and continually seamless, with almost no configuration required.

Whether that's good or not may be arguable, but it's certainly a selling point for many and I don't see Google or any other company's offerings approaching the same experience, and I suspect that's by design; they have to be more open and support all devices but that kinda dilutes everything. Apple will only get stronger in that aspect IMO.


I would say they're already not competing on the assistant side. Siri is considerably worse than Google Now, even though it came out first.


I'm not sure I agree with you, long term (about the chips). I think that the value here is in the ecosystem. If Google can compete with CUDA, they'll be doing really well.


> There's no way Google lets this leave their datacenters. Chip fabrication is a race to the bottom at this point. [1]

I’d hope someone somewhere steals the blueprints and posts all of them publicly online.

The whole point of patents was that companies would publish everything, but get 20 years of protection.

But by now, companies like Google especially don't do so anymore – and everyone loses out.

EDIT: I’ll add the standard disclaimer: If you downvote, please comment why – so an actual discussion can appear, which usually is a lot more useful to everyone.


There's little need for anyone to steal the blueprints. It's unlikely there's anything particularly "special" there other than identifying operations in Tensorflow that take long enough and are carried out often enough and are simple enough to be worth turning into an ASIC. If there's a market for it, there will be other people designing chips for it too.


Same misuse happens with copyright. Both were invented to foster publishing, not to create life-long monopolies. The lifetimes of copyright and patents should be way shorter as well. Everyone builds on something that came before. It's impossible to build a better bike if you have to test drive on a street with patent mines.

Re EDIT: Downvotes should either require a comment or not be allowed at all.


I don't see any mention of offering these chips for sale. It seems you can rent them via cloud offerings, and that's it.


Sure, but that's the deal. I'll buy the latest nVidia 1080 card as soon as I can, but renting these custom chips per minute would be a way better option for me.


GPUs also have this nice side effect of being great for playing games. Purely as a guess, I'd think that the gaming market is bigger than the AI researcher market.


In a future where AI is everywhere, Nvidia hopes it can sell GPUs by the hundreds and thousands to large data centers. You can make a lot more money a lot faster selling your hardware this way, and Nvidia is very interested in it judging from how much they talked about it at their recent conference.


I would be surprised if they weren't working on their own specialised chips then, though Google have the advantage of already having the software specs to build for.


> Purely as a guess I'd think that the gaming market is bigger than the AI researcher market.

Machine learning isn't just targeting the AI researcher market though -- it's widely used by a huge number of companies, and of course, by many of Google's most important products. I would argue that those markets combined are larger than gaming.


Yup, I assume they're gonna keep them in house as a competitive advantage for a time. I doubt they'll do it forever; the most valuable part of NVIDIA's CUDA is the ecosystem, and I think Google knows that.


I would assume that the API to use these is TensorFlow.

So... just use Google's machine learning cloud thingy.

The software can build the community, while the supercharging is only available when you run it on Google's cloud.

(Although GPU performance isn't bad either, so you don't have to; hence the community.)
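
Purely as illustration, a minimal TensorFlow (1.x-style) sketch of what that buys you: the graph is written once and the runtime decides where ops execute. The CPU/GPU device strings below are real; whatever string Google exposes for its cloud-hosted accelerators is my assumption.

    import tensorflow as tf

    # The graph itself is hardware-agnostic; only the device placement changes.
    with tf.device('/cpu:0'):        # or '/gpu:0' on a CUDA box
        a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        b = tf.constant([[1.0, 0.0], [0.0, 1.0]])
        c = tf.matmul(a, b)          # the kind of op an accelerator speeds up

    with tf.Session() as sess:
        print(sess.run(c))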


Quantum computers, OpenPower, RISC-V, and now this - I'm really liking Google's recent focus on designing new types of chips and bringing some real competition into the chip market.


What are they doing with RISC-V?


They dumped a bunch of money into it, so presumably they're at least interested.


It's an ASIC tuned for specific calculations; I'm sure its power consumption is better than a general-purpose GPU's. Same as how crypto-mining ASICs crush GPUs in terms of power efficiency.

There isn't much data yet, but I'm also guessing they probably have access to much more RAM than NVidia cards and can process much bigger data sets.


I'm surprised by the perf claims. Nvidia isn't doing child's play. The graph implied they were untouchable in terms of perf...


Nvidia has to be general purpose. This is not, and thus can be better optimized.


"General purpose" isn't that general, if you look at the actual operations they support and their threading model. It's already fairly optimized for these sorts of operations, and this amount of claimed headroom makes me suspicious.


Google has a lot of potential options that NVidia doesn't have. They can size their cache hierarchy to the task at hand. They can partition their memory space. They can drop scatter/gather. They can gang ALUs into dataflows that they know make up the majority of machine learning workloads. They can partition their register file at the ISA level, or maybe even drop it entirely. They can drop the parts of the IEEE 754 floating-point spec they don't need, and they can size their numbers to the precision they need.


The fact that I can compile arbitrary programs for the GPGPU means it is general purpose. NVIDIA isn't writing softmax or backprop into silicon as a CPU instruction.

Look at how much faster ASICs for bitcoin mining are than the GPU... orders of magnitude.


"Backprop" isn't even close to something that would be a "CPU instruction", it's an entire class of algorithm. It's like saying "calculus" should be a CPU instruction. Matrix multiplication & other operations, on the other hand, do neatly decompose into such instructions, which have been implemented by NVidia et al., since that's the core set of functionality they've been pushing for like a decade now.

Additional die space on additional functionality might hurt the power envelope (which is where the focus on performance / watt rather than performance kicks in) but it doesn't make your chips slower per se.
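
To make this concrete, a rough numpy sketch (sizes made up): a fully connected layer's forward pass is essentially one matrix multiply plus a couple of elementwise ops, i.e. exactly the primitives GPUs already accelerate.

    import numpy as np

    batch, n_in, n_out = 64, 1024, 512
    x = np.random.randn(batch, n_in).astype(np.float32)   # input activations
    W = np.random.randn(n_in, n_out).astype(np.float32)   # layer weights
    b = np.zeros(n_out, dtype=np.float32)                  # biases

    h = np.maximum(x @ W + b, 0.0)   # matmul + bias + ReLU: the whole layer
    print(h.shape)                    # (64, 512)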


That was my impression too. ML under the hood is a lot of linear algebra, not very different from most shaders. But maybe Google decided to hardcode a few important ML primitives because the ROI was that good in terms of grabbing customers. Also, they might have very large-scale applications not found elsewhere that motivate this.


OK, I was obviously oversimplifying things, but my point is that, since we can only speculate, it's clear that when you know the specific algorithms, math operations, memory layouts, and applications you want to optimize for, you can create dedicated chips that do them quickly. That bitcoin miners are all dedicated chips and run circles around GPUs demonstrates exactly this fact.

Furthermore the fact that ML can be error tolerant means you also get to optimize certain floating point operations for speed or energy efficiency at the cost of accuracy. NVIDIA doesn't get to do this in their linear algebra support.
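
A small numpy illustration of that trade-off (I don't know what number formats Google actually uses, so this is just the general idea): dropping from float32 to float16 halves memory traffic but introduces error that many ML workloads tolerate.

    import numpy as np

    x = np.random.randn(256, 256).astype(np.float32)
    w = np.random.randn(256, 256).astype(np.float32)

    exact = x @ w                                               # full precision
    approx = (x.astype(np.float16) @ w.astype(np.float16)).astype(np.float32)

    rel_err = np.abs(exact - approx).max() / np.abs(exact).max()
    print('max relative error: %.4f' % rel_err)                 # small but nonzero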


Bitcoin mining is an extremely well-defined task compared to machine learning. It remains to be seen how general these TPUs are in practice - whether they will support the neural network architectures common two years from now.


tbh I only realized what you meant toward the end of writing my comment. I should have added it as a PS.


If they balance compute to memory better than GPUs, you could definitely see a 10x. GPUs have large off-chip memory and small caches (like 256KB). The cost of going to off-chip memory can be 1-2 orders of magnitude more than on-chip memory. You can certainly fit 4+MB on modern processors, but they likely bought designs from a company like Samsung because designing high-performance, low-power memory cells is tricky. I'm surprised they were able to keep things a secret.
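
Back-of-envelope arithmetic for the cache-size point, using the numbers above (purely illustrative):

    # One fp32 weight matrix of 1024x1024 already dwarfs a ~256KB on-chip cache,
    # so it streams from off-chip memory; a few MB of on-chip SRAM changes that.
    weights_bytes = 1024 * 1024 * 4      # one fp32 layer: 4 MB
    cache_bytes = 256 * 1024             # ~256KB cache, as mentioned above
    print(weights_bytes // cache_bytes)  # 16x larger than the cache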


What graph?


In the talk there was a two-bar graph:

    {(others, ~bottom) (google, ~top)}
Couldn't see more, but after Nvidia claimed overwhelming power with their latest GPU architecture, including in the ML domain... I was surprised.


For 3., I've been interested in the same. How would it work though? Would it include non-free components, or be open source? If it's open source, why would the majority of people pay?

I say this as someone who would definitely pay. I'd love to see something like an XPS developer edition, but with a (relatively) coherent graphics paradigm like OS X.


> If it's open source, why would the majority of people pay?

We could have paid access to the repositories. Yes, anyone can redistribute it, but it's cumbersome and it delays your bugfixes, so people with the ability to pay will prefer the official method, especially if they know they're funding open source and they get good service. However, to stay legit in the OSS ecosystem we need to redistribute part of the donations.


+1 from me. My favorite feature for browsing Reddit is the RES collapsing button.

If HN declines to add it, maybe someone can make an `HNES'?


I use this chrome extension ->

https://chrome.google.com/webstore/detail/hn-special-an-addi...

You have to disable their hideous theme, but it adds collapsible comments and infinite scroll on the home page, which is pretty neat.


I like HN Enhancement Suite:

http://i.imgur.com/XJOS9oz.png


I use HackerNew. I'm not sure how the features compare, but I like it overall.

https://chrome.google.com/webstore/detail/hackernew/lgoghlnd...



Almost all of the open source software in the area is permissive-licensed, and relies on non-free components (CUDA).

To be honest, I'm not sure how Gneural plans to compete with those packages without support from CUDA or cuDNN, all of which are distinctly not open source.


FANN is GNU-licensed (LGPL 2.1), doesn't rely on non-free software, and is written in C, so it's the same as Gneural in those regards. But it is also way more mature, has more features, compiles and runs on Linux, Windows, and macOS, and has bindings to 28 other languages.


Gneural will compete with Theano sort of like how the GNU Hurd competes with Linux


I know you're just trying to be funny, but I don't think it's funny at all.

The Linux kernel undoubtedly has many features that the Hurd system lacks, but that is due to the severe lack of manpower of the latter system and the billions of dollars being poured into the former.

On the other hand the Hurd has features that the Linux kernel can never hope to achieve because of its architecture.


> The Linux kernel undoubtedly has many features that the Hurd system lacks, but that is due to the severe lack of manpower of the latter system and the billions of dollars being poured into the former.

That's why GNU Hurd is essentially a dead project. Sadly it never attracted the attention and manpower necessary for it to survive.

> On the other hand the Hurd has features that the Linux kernel can never hope to achieve because of its architecture.

For example?


Fault isolation. We're doing it for daemons, we're doing it for web browsers; it is insane that we're not doing it for operating system services. I bought a graphics tablet, and the first time I plugged it into my laptop the Linux kernel crashed. And this was merely a faulty driver, not even malicious hardware.

Also think of the effort it took to introduce namespaces to all the Linux subsystems. After a decade the user namespace still has problems. This is ridiculously easy on a distributed system, yet very hard on a monolithic one.


I am not trying to be funny. I am dead serious. Aeolos explained it perfectly.


In other words, not at all


That's not necessarily relevant though. I'm sure the FSF would love to see Free Software replace all proprietary software, but in the end, the real point is that Free Software options are available to the people who want them. This isn't like a battle between commercial entities where market share is king and a project will be dropped if it isn't profitable. Gneural will be a success if a community forms around it and people work on it and use it, however small that community might be.


The problem I have with it is that they could be contributing their brainpower and time to other open source projects instead of reinventing the wheel for very little benefit. Take my opinion with a grain of salt, as I consider the more restrictive copyleft licensing a net /loss/ for society.


To be honest, I'm not sure how Gneural plans to compete with those packages without support from CUDA or cuDNN, all of which are distinctly not open source.

I don't see the point either. Gneural will probably never be better than Theano, Torch, Tensorflow, Caffe, et al., which are already open. If anything, time/resources are much better invested in contributing a polished/competitive OpenCL backend to one of these packages.


Caffe has an OpenCL backend - https://github.com/BVLC/caffe/tree/opencl


I'd really like to understand the reasons behind the focus on CUDA and not OpenCL. My understanding is that nVidia and AMD made sure their hardware and software would make the GPU accessible for non-graphics tasks, but AMD's version is not functionally or legally locked to their hardware. Why hasn't OpenCL taken off and run on nVidia hardware?

It seems like there must be more at play, but I'll admit a lack of insight and imagination on this one.


It seems like there must be more at play, but I'll admit a lack of insight and imagination on this one.

I think the reasons are twofold: 1. CUDA had a big head start over OpenCL. 2. NVIDIA has invested a lot in great libraries for scientific computing. E.g. for neural nets, they have made a library of primitives on top of CUDA (cuDNN), which has been adopted by all the major packages.


Performance. OpenCL has been 2-5x slower for ML than CUDA. Not sure of the exact reason, but I think it's the highly optimised kernels, which exist in cuDNN but not for OpenCL. I think it's mostly a software issue; compute capacity in theory should be more or less the same with equivalent AMD/NVidia cards.

AMD should have invested much more heavily in ML; if they had, their share price would probably look a bit better than it does now.

This looks interesting - running CUDA on any GPU. http://venturebeat.com/2016/03/09/otoy-breakthrough-lets-gam...


I recall hearing that CUDA has much more mature tooling. Not only the already mentioned cuDNN, but the CUDA Toolkit [0] seems like a really comprehensive set of tools and libraries to help you with pretty much anything you might want to compute on a GPU.

Also somewhat related: AMD seems to be moving towards supporting CUDA on its GPUs in the future: http://www.amd.com/en-us/press-releases/Pages/boltzmann-init...

[0] https://developer.nvidia.com/cuda-toolkit


On closer inspection, it looks like AMD's CUDA support consists of "run these tools over your code and it will translate it so your code does not depend on CUDA"...

It's sort of supporting CUDA, just like a car ferry sort of lets your car 'drive' across a large body of water.


Because it requires nVidia's cooperation in implementing OpenCL. And of course they are not about to do so in a useful manner when they are leading with CUDA.

Also, the premise of OpenCL is somewhat faulty. You end up optimizing for particular architectures regardless.


and relies on non-free components (CUDA).

Yeah, this is one reason I'm really hoping some of the stuff AMD is pushing, in regards to openness around GPUs, gains traction. And why I am hoping OpenCL continues to improve so that it can be a viable option. Being dependent on nVidia for all time would blow.


The use of the GPLv3 allows Gneural to have, as a dependency, any of the Apache- or permissive-licensed tools like TF, Torch, etc., and then through those tools 'export' their dependence on non-free components from Nvidia and others.

I don't think this is wrong, per se, but it is ...funny when the FSF portrays their work as morally superior to us horrible corporate permissive-license lovers, while inexorably depending on non-free components.

In an ideal world this project will be popular and will lead to someone on Gneural writing Nvidia-compatible drivers that will allow them to reject Nvidia's, but I'm not optimistic. Not because of some incompetence on the Gneural team, but because of Nvidia's long history of making life very difficult for open driver writers.


Does the FSF really depend on non-free components?


It's possible to run any of the "major" neural network toolkits (Caffe, Torch, Theano) on CPU-only systems. All of them are permissively licensed (to my knowledge).

It will be prohibitively difficult to train the model without some kind of hardware assistance (CUDA). This means that if we're building an ImageNet object detector, even if the code implements the model correctly the first time, training it to have close-to-state-of-the-art accuracy will take several consecutive months of CPU time. Torch has rudimentary support for OpenCL, but it isn't there yet. There are very good pre-trained models that are licensed under academic-only licenses that also help fill the gap. (This is about as permissively as it could be licensed because the ImageNet training data itself is under an academic-only license anyway.)

I'm not sure what niche this project fills. If you want an open-source neural network, you have several high-quality choices. If you need good models, you can either use any of the state-of-the-art academic only ones, or you would have to collect some dataset completely by yourself.
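
To make the CPU-only point concrete, here's a toy numpy SGD step for a linear softmax classifier (sizes arbitrary). Nothing here needs CUDA, which is the point, but these matrix multiplies are also exactly what makes ImageNet-scale training take months without a GPU.

    import numpy as np

    rng = np.random.RandomState(0)
    X = rng.randn(128, 784).astype(np.float32)        # a fake mini-batch
    y = rng.randint(0, 10, size=128)                   # fake labels
    W = 0.01 * rng.randn(784, 10).astype(np.float32)   # linear classifier weights

    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

    grad = probs.copy()
    grad[np.arange(128), y] -= 1.0                      # d(loss)/d(logits)
    W -= 0.1 * (X.T @ grad) / 128                        # one SGD step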


> This is about as permissively as it could be licensed because the ImageNet training data itself is under an academic-only license anyway.

Does this necessarily follow, that a machine-learning model is a derived work of all data it's trained on? As far as I know, the law in this area isn't really settled. And many companies are operating on the assumption that this isn't the case. It would lead to some absurd conclusions in some cases, for example if you trained a model to recognize company logos, you'd need permission of the logos' owners to distribute it.

(This is assuming traditional copyright law; under jurisdictions like the E.U. that recognize a separate "database right" it's another story.)


I'm not aware of the formal legality of it, but I don't see why it wouldn't be the case. Without the training data, the model can't work. That seems to fit the definition of "derivative work".


IANAL, but I looked at the definition of derivative work, and it seems really hard to apply to learning algorithms. But I'm going to disagree with you. I notice that US law mentions "preexisting material employed in the work". IMO a set of neural network weights contains no preexisting material at all. All the examples of derivative works include at least parts of previously copyrighted works directly.

I'd like to note that some publishers, like Elsevier, allow you access to their dataset (full texts of articles) under a license with the condition that you can not freely distribute models learnt from their data.


Wrong, most do support OpenCL or at least have partial support. It's just much less supported because not many people see much benefit from it. Btw, it's all open source. If you miss some functionality, it's really easy to add.


Yeah, you don't depend on CUDA/cuDNN, but of course you can use them if you want it to be fast

But the CPU fallback is there


It's going to need to use CUDA or it will not be competitive with the alternatives. CUDA makes training networks more than an order of magnitude faster.


But that may or may not matter, depending on what you're doing. And how often you do it. If I have a network that I only retrain once a month, I can deal with it taking a day or two to train. Heck, it could take a week as far as that goes.

OTOH, it obviously matters a lot if you're constantly iterating and training multiple times a day or whatever.


The difference is between training taking a week, and training taking 10 weeks.

It takes a week to train a standard AlexNet model on 1 GPU on ImageNet (and this is pretty far from state of the art).

It takes 4 GPUs 2 weeks to train a marginally-below state of the art image classifier on ImageNet (http://torch.ch/blog/2016/02/04/resnets.html) - the 101 layer deep residual network. This would be 20 weeks on an ensemble of CPUs. (State of the art is 152 layers; I don't have the numbers but I'd guess-timate 3-4 weeks to train on 4 GPUs).


For state of the art work "a day or two" is pretty fast for a production network, and that's on one or more big GPUs. Not using CUDA is definitely a dealbreaker for any kind of real deep learning beyond the mnist tutorials. It's common to leave a Titan X to run over a weekend; that would be weeks on a CPU.


Well not using CUDA isn't necessarily synonymous with "use a CPU". There is OpenCL. But still, you have a point even if we might quibble over details. This is why I am very much hoping AMD gets serious about Machine Learning and hoping for OpenCL on AMD chips will eventually reach a level of parity (or near parity) with the CUDA on nVidia stuff.


It's unlikely that AMD is going to be able to make serious inroads in the near future. nVidia has built quite a lead, not just in terms of chips but tooling. I had thought a couple of years ago that AMD should be building a competitor to the Tesla. It should be able to build a more hybrid solution than nVidia can, given its in-house CPU development talent. But I haven't seen them building anything like that, and a competitor to nVidia may have to come from somewhere else. In the absence of a serious competitor, OpenCL is not very interesting.


Yeah, and that's sad. I really hate to see this whole monoculture thing, especially since CUDA isn't OSS. :-(


It's really a hardware problem.


Why not focus on adding GPLed code to an existing package with a GPL-friendly license?


The FSF wants to be the copyright holder for all of its projects' code so that it's possible to relicense the codebase under newer versions of the GPL. For the same reason, everyone contributing to their projects must sign a CLA.

