soarerz's comments | Hacker News

> China easily comes to mind as a counter argument.

I mean, the Cultural Revolution was still going on 50 years ago lol


If your baseline is Xtramath - when did you last try to use ChatGPT to solve problems at that level? It certainly seems like it works.


It's because the problem you need to solve isn't that hard and is already solved; it does not need a crazy complex novel solution. All you have to do is present the problem and solution set. Xtramath isn't some sort of complex system, and it does not need to be; it's stupid simple and does the thing it's supposed to do.


The model's first attempt is impressive (not sure why it's labeled a choke). Unfortunately GPT-4o cannot discover calculus on its own.


I think this is the biggest flaw in LLMs and what is likely going to sour a lot of businesses on their usage (at least in their current state). It is preferable to give the right answer to a query, and it is acceptable to be unable to answer a query - we run into real issues, though, when a query is confidently answered incorrectly. This recently caused a major headache for Air Canada - businesses should be held to the statements they make, even if those statements were made by an AI or a call center employee.


The Air Canada incident happened before ChatGPT was released, so I haven't seen a reason to believe AI was involved.


I can't tell if you're being sarcastic or not - but AI predates ChatGPT.


Chatbot-style AI didn't, and certainly not one major airlines would be using for customer service.


It's a choke because it failed to get the answer. Saying other true things but not getting the answer is not a success.


I mean, in this context I agree. But most people doing math in high school or university are graded on their working of a problem, with the final result usually equating to a small proportion of the total marks received.


This depends on the grader and the context. Outside of an academic setting, sometimes being close to the right answer is better than nothing, and sometimes it is much worse. You can expect a human to understand which contexts require absolute precision and which do not, but that seems like a stretch for an LLM.


LLMs being confidently incorrect until they are challenged is a bad trait. At least they have a system prompt to tell them to be polite about it.

Most people learn to avoid the person who is wrong or has bad judgment and is arrogant about it.


I think current LLMs suffer from something similar to the Dunning-Kruger effect when it comes to reasoning - in order to judge correctly that you don't understand something, you first need to understand it at least a bit.

Not only do LLMs not know some things, they don't know that they don't know, because they lack true reasoning ability, so they inevitably end up like Peter Zeihan, confidently spouting nonsense.


This is supposed to be a product, not a research artifact.


> But most people doing math in high school or university are graded on their working of a problem, with the final result usually equating to a small proportion of the total marks received

That heavily depends on the individual grader/instructor. A good grader will take into account the amount of progress toward the solution. Restating trivial facts of the problem (in slightly different ways) or pursuing an invalid solution to a dead end should not be awarded any marks.


It choked because it didn't solve for `t` at the end.

Impressive attempt though; it used the number of wraps, which I found quite clever.


I don't know... here's a prompt query for a standard problem in introductory integral calculus, and it seems to go pretty smoothly from a discrete arithmetical series into the continuous integral:

"Consider the following word problem: "A 100 meter long chain is hanging off the end of a cliff. It weighs one metric ton. How much physical work is required to pull the chain to the top of the cliff if we discretize the problem such that one meter is pulled up at a time?" Note that the remaining chain gets lighter after each lifting step. Find the equation that describes this discrete problem and from that, generate the continuous expression and provide the Latex code for it."
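For reference, a quick numeric sketch of what the prompt asks for, assuming a uniform chain (λ = 10 kg/m, one metric ton over 100 m) and g = 9.8 m/s²; the discrete sum and the continuous integral land within about 1% of each other:

```python
g = 9.8            # m/s^2
lam = 1000 / 100   # kg per meter: one metric ton spread over 100 m

# Discrete version: before step k (k = 0..99), 100 - k meters still hang;
# pulling the next meter over the edge lifts that remaining mass by 1 m.
discrete = sum(g * lam * (100 - k) * 1 for k in range(100))

# Continuous limit: W = integral from 0 to 100 of g * lam * (100 - x) dx
#                     = g * lam * 100^2 / 2
continuous = g * lam * 100**2 / 2

print(discrete)    # 494900.0 J (the extra ~1% comes from the 1 m step size)
print(continuous)  # 490000.0 J
```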


Or... use calculus?

It has gotten quite impressive at handling calculus word problems. GPT-4 (original) failed miserably on this problem (it attempted to set it up using constant-acceleration equations); GPT-4o finally gets it correct:

> I am driving a car at 65 miles per hour and release the gas pedal. The only force my car is now experiencing is air resistance, which in this problem can be assumed to be linearly proportional to my velocity.

> When my car has decelerated to 55 miles per hour, I have traveled 300 feet since I released the gas pedal.

> How much further will I travel until my car is moving at only 30 miles per hour?
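For what it's worth, linear drag makes this one tidy: from m·v·(dv/dx) = −k·v it follows that dv/dx = −k/m is constant, so speed falls linearly with distance. A sketch of that shortcut (my working, not the model's output):

```python
# With F = -k*v: m*(dv/dt) = -k*v, and dv/dx = (dv/dt)/v = -k/m,
# a constant. So speed decreases linearly with distance traveled.
drop_per_foot = (65 - 55) / 300        # mph lost per foot: 10 mph over 300 ft
further = (55 - 30) / drop_per_foot    # feet needed to shed another 25 mph

print(round(further, 1))  # 750.0 feet
```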


Does it get the answer right every single time you ask the question the same way? If not, who cares how it’s coming to an answer, it’s not consistently correct and therefore not dependable. That’s what the article was exploring.


I labeled it choke because it just stopped.


Right, it's the only answer that accounts for the wasted space there might be between wraps.


Can it be taught calculus?


What is the cheapest way to capture similarity if not via dot product then?


Instead of a sum of multiplications you could, for example, use a sum of squared differences.

Mean squared error instead of dot product; it's not cheaper, but it's close.

If you want to go cheaper, you could use the sum of absolute differences.


This is effectively "the same" as the dot product.

For a lot of the embeddings we have today, the norm of every embedding vector is roughly the same, so the angle between two vectors tracks the length of the difference you describe, and that distance can be expressed in terms of 1 - dot product after scaling.
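A small sketch of that equivalence, assuming unit-norm embeddings: for unit vectors, ‖a − b‖² = 2(1 − a·b), so squared Euclidean distance and dot product carry the same ranking information.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # Scale to unit norm, as many embedding models effectively do.
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 1.0, 0.5])

# ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a.b = 2 - 2*a.b for unit vectors
sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
assert abs(sq_dist - 2 * (1 - dot(a, b))) < 1e-12
```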


I don't really have an answer for this outside of silly ones like "strict equality check", but I assert that no one else does either, at least today and right now, and it's an inherent limitation due to the nature of embeddings and the niche they aim to fill (cheap, fast, good-enough similarity for your use case).

You're probably best off using the commercial suggestion, and if it's dot product, go for it. I am no expert in this area and my interest wanes every day.


Interested to know as well


> Apart from the bloat, the main problem of Microsoft LinkedIn is that it does not let you export your contacts' infos, which really is a must-have feature of a contact platform.

Platform lock-in is certainly intended (even though it sucks for users).


Is it including options and RSUs?


Yes, not all cash compensation, but that's still an insanely high amount considering the total revenue being what it is.


It's not because HN has better moderation/policy than others. It's because HN never needs to hit DAU/revenue goals.


This helps, but the sites I ran never comingled revenue goals with content moderation and they still enshittified.


Consider Euler Circle https://eulercircle.com/ if the kid is interested in learning about modern math. They have online classes, and finances may not be a concern if the student is admitted and is strong enough.


Would you have opposed the Industrial Revolution?


Yes, I would have. Absolutely. No Industrial Revolution means no climate change, no global warming, and none of the habitat destruction we see today.


No industrial revolution also means infant mortality rates of over 46% (in the US, at least).


By decreasing our own mortality rate, we have increased the mortality rate of every other species on this planet. How is that fair?


Sorry, are you equating AI and software?


When you look at a standard AI textbook, such as Russell/Norvig, you see that there is not much to what gets called "AI". The simplest "intelligent agents" are functions with an "if" statement. The smallest Node.js application has more complexity.
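A minimal sketch of the kind of "simple reflex agent" Russell/Norvig open with (the two-square vacuum world; the names here are my own), which really is just a function with an "if":

```python
def reflex_vacuum_agent(location, dirty):
    # Condition-action rules: suck if the current square is dirty,
    # otherwise move toward the other square.
    if dirty:
        return "Suck"
    return "Right" if location == "A" else "Left"

print(reflex_vacuum_agent("A", dirty=True))   # Suck
print(reflex_vacuum_agent("A", dirty=False))  # Right
print(reflex_vacuum_agent("B", dirty=False))  # Left
```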


It's a useful tool when examining the impact on moral questions; much of the talk about the transformative power of AI becomes clearer once you give up the pretence that introducing AI creates a new class of moral actor that breaks the conventional chains of responsibility.

A recent example of how people try to use this mystical power of AI to absolve themselves of responsibility for their actions is how UnitedHealthcare, an organisation largely in the business of suppressing health care to those in need, introduced an atrociously bad "AI" to help them deny applications for coverage.

In that example it is very clear that the "AI" is simply an inert tool used by UHC leadership to provide the pretext they feel is needed to force the line workers to deny more care without the whole thing blowing up because of moral objections.


AI is software and AI is a term as broad and unspecific as "software".


Software is the purely informational elements and constructs of a computing mechanism.

Or, more broadly:

Software is any construct that is functionally equivalent to its description.

