
The word "translated" isn't correct here, as the articles may not be translations of other articles.


I concur. Most Wikipedia articles in different languages on the same topic are not translations of each other, in my experience. It is better to think of them as independent of each other.


I've often wondered if that's a good thing or a bad thing.

It feels like, reading through Wikipedia in English, I'm missing some specifics, details, or even points of view on particular (international) topics. I was reading about a town in Estonia recently while trying to track down some ancestry, and while the English page had limited information, when I switched to Estonian and used Google Translate I was able to find a ton of detail. I see the same when reading about smaller towns in India or about non-English literature.

Would some sort of auto-translation and content augmentation (with editorial support) be useful here?


If you speak multiple languages (or are willing to read machine translation) you can often get a much richer understanding of a topic by reading Wikipedia in multiple languages. Wikipedia strives to be unbiased, but obviously that's a goal, not a reality, and different language editions are biased in different directions. Even when articles are of comparable length, they often emphasize very different parts and deem different facts relevant.

And sometimes there are facts that are just less relevant in certain languages. The English article on the model railway scale HO spends the majority of its introduction on a lengthy explanation that HO stands for "Half O", and that the O scale is actually part of the set of 0, 1, 2 and 3 scales, but English speakers still use the letter O. That is important to note in an English article, but completely irrelevant in the majority of languages that don't share this very British quirk and call it H0 instead.

Cultural diversity is a big strength of Wikipedia. Turning everything into one smooshed-together thing would be a travesty. Making the various versions more accessible to readers would be helpful, but it would also dilute the diversity, as it would certainly bring more editors of one language into other language versions of the same article, leading to more homogenized viewpoints and a view that's even more dominated by the most active Wikipedians (presumably Americans and Germans).


> Would some sort of auto-translation and content augmentation (with editorial support) be useful here?

The Wikipedia folks are working on this, but planning on auto-generating natural language text from some sort of language-independent semantic representation. This avoids the unreliability of typical machine translation and makes it comparatively easier to have meaningful support for severely under-resourced languages, where providing good encyclopedic text is arguably most important. Details are very much TBD but see https://meta.wikimedia.org/wiki/Abstract_Wikipedia for a broader description. If you prefer listening to a podcast episode, see e.g. https://www.youtube.com/watch?v=a57QK4rARpw
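
To give a flavour of the idea, here is a purely illustrative sketch in Python of what "generate natural-language text from a language-independent representation" could look like. This is not the actual Abstract Wikipedia design (whose details, as noted, are still TBD); the fact structure and the per-language renderers below are hypothetical.

    # Purely illustrative sketch of the Abstract Wikipedia idea -- not its
    # real design or API. Facts live in a language-independent structure;
    # small per-language renderers turn them into sentences.
    from dataclasses import dataclass

    @dataclass
    class BirthFact:
        person: str   # e.g. the label of a Wikidata item
        place: str
        year: int

    def render_en(fact: BirthFact) -> str:
        # Hypothetical English renderer
        return f"{fact.person} was born in {fact.place} in {fact.year}."

    def render_fr(fact: BirthFact) -> str:
        # Hypothetical French renderer (ignoring grammatical subtleties)
        return f"{fact.person} est né à {fact.place} en {fact.year}."

    fact = BirthFact("Douglas Adams", "Cambridge", 1952)
    print(render_en(fact))
    print(render_fr(fact))

The point is that each language edition only needs renderers, not its own copy of the facts, which is exactly what makes the approach attractive for under-resourced languages.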


In my experience, "simple translations" are even explicitly discouraged in at least some language versions of Wikipedia.


They are functionally completely independent. And it's true that translation is not necessarily the way foreign-language articles get written. But it's also true that editors can often lift whole sections out of one project to translate into another project, by way of augmenting the article content. What's terrible is when this is bad content, because errors can easily be propagated in very persistent ways!

I need to qualify that independence, though: all Wikipedia projects draw from at least two common sources, Commons and Wikidata. Now, Commons is media-oriented, so most of the images and audio it provides may be language-independent. Wikidata is something that's not well incorporated into the English Wikipedia, but it's heavily used elsewhere. For basic facts and data points, Wikidata is a common database that levels the playing field for all projects, helping them expand equally without monumental individual efforts in each community. If you, as an editor, wish to truly make significant changes happen, you can do it from "behind the scenes" in one of these two projects.
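
For instance, here's a small sketch of pulling a basic fact from Wikidata through its public API; the wbgetentities action is real, while the helper function and the choice of item are just for illustration. The same item backs the corresponding article in every language edition.

    # Sketch: fetch a Wikidata item's label via the public API
    # (action=wbgetentities is the real endpoint; the helper is illustrative).
    import json
    import urllib.request

    def get_entity_label(entity_id: str, lang: str = "en") -> str:
        """Return the label of a Wikidata entity (e.g. Q42) in the given language."""
        url = ("https://www.wikidata.org/w/api.php"
               "?action=wbgetentities&ids=" + entity_id +
               "&props=labels&languages=" + lang + "&format=json")
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        return data["entities"][entity_id]["labels"][lang]["value"]

    # Q42 is Douglas Adams; every Wikipedia edition can draw on the same item.
    print(get_entity_label("Q42", "en"))
    print(get_entity_label("Q42", "fr"))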

It is very advantageous to read articles in different languages if you're interested in deep dives on the subject. For each topic, figure out its native tongue and go there to find good info. You can even just Google Translate it back to English. And sign in to your account: the Wikipedia preferences permit you to use an interface language that you can understand.

You'll find that English Wikipedia editors have an aversion to non-English sources and citations. They're not prohibited at all, but they are unpopular, because other editors like to be able to read and verify sources. Unfortunately, English Wikipedia articles on foreign topics can often be built entirely on Anglophone sources, making a very impoverished, insensitive, inaccurate article! I've found foreign-language projects, which provide foreign-language sources, to be really useful in this regard.


Yeah, this needs determinations as to which articles are translations of each other, and which are independent articles on the same topic.


It's not exactly tolerated by the Wikipedia community; see this discussion:

https://en.m.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost...


Ironically, her husband's name is Robin.


Plot twist, her name is Batman!


Wikipedia itself is also working on a new UI:

https://www.mediawiki.org/wiki/Reading/Web/Desktop_Improveme...

The English Wikipedia isn't one of the "early adopter wikis" but if you visit, say, the French-language Wikipedia you can see the changes (e.g., different width to improve readability).


Every time I get to a French article somehow, I'm reminded that I should be dreading the day this gets merged into master. It's like this half-baked cards feature: whenever you search on the page and hit a link, that card will stay open indefinitely. You can't hit escape to close it, and you can't hit tab to move off the link because then you'll just get another card for the next link; you have to go for the mouse. It was supposed to be an on-hover feature for when you're using the mouse in the first place, but instead it obscures text all the time. Luckily, cards are something you can turn off; with the narrow new style I doubt we'll have such luck. Language switching also became a mouse-requiring operation in the new design (or much more involved). Other than that and the squished text, I don't actually see any attempt at making it look modern.


Wow, that is shockingly bad. Different objects just floating in space without lines between them to visually differentiate them.



It's a lot of what m.wikipedia.org looks like, actually. Just adding the m. to any given Wikipedia article's URL on a desktop makes it so much easier to read, in my experience. YMMV. :)


That is really bad, what were they thinking?


You should look into BDI agents and BDI programming languages.


I will. Thanks!


Being good at hacking doesn't mean they can write clean maintainable code (in a team).


It may be because it's the Vault codebase, a tool for secrets management, encryption as a service, and privileged access management.

One would hope that such a codebase is well understood and implemented.


For context, Alibaba uses their own fork of MySQL which is AliSQL:

https://github.com/alibaba/AliSQL


I'm curious what their motivation would be for investing so much in MariaDB when they already have their own. Hedging bets? Encouraging competition?


Looking at their commit history, it's not a fork in the sense of "we're actively developing our own new features and diverging from upstream", it's a fork in the sense of "we've taken a stable release from upstream, applied some of our own patches, some backported patches, and some patches from other forks."

It's a fork as much as the stable packages that RedHat or Debian ships are "forks" since they have a few patches applied to customize for their environment and fix a few issues that hadn't made it upstream by the time they picked a stable release to base on.

MariaDB is an active, independently developed project. They are probably funding MariaDB because working with upstream developers of an actively developed fork can be more efficient than striking out on their own with another actual fork. I wouldn't be surprised if they eventually start using MariaDB, and either rebase their AliSQL patches on top of it or drop AliSQL altogether (much of it just looks like backports of bug fixes or features).


Their Github indicates there's cross-pollination:

> Moreover, it includes patches from Percona, WebScaleSQL, and MariaDB.


Getting out of the business of having their own?


With Alibaba Cloud, infrastructure like databases is their business. Take AWS Aurora and GCP Spanner as examples.


But it's also easier to sell a standard DB people already know vs a custom one

(see (Google) Bigtable vs (Amazon) Postgres)


Those aren't exactly comparable products - also, is it obvious which one sells better? Is it well known that amazon postgres sells like hotcakes?


Odd comparison; Google has supported MySQL for a long time, as has AWS.

Dynamo might be a slightly closer comparison


Google Cloud SQL supports postgres as well.


that's kind of my point

databases people know and use already (eg in fun projects) are easier to sell


Google Cloud SQL and AWS RDS now both support PostgreSQL.


FYI: Alibaba is also developing its own Aurora-like DB.

https://www.yicaiglobal.com/news/alibaba%E2%80%99s-new-cloud...

A more detailed intro (in Chinese) can be found at http://www.infoq.com/cn/news/2017/08/ali-polardb.

Alibaba has developed another DB called OceanBase (open source). https://news.ycombinator.com/item?id=12925679


Do you specifically want it to be about Python scripts or also other languages?


Yes why only Python?


The opening sentence of a linked article sums it up quite well: "If nobody understands a mathematical proof, does it count?"

The abc conjecture may be solved today, but it only counts once a sufficient number of people understand and accept the proof as a proof.


Formalize it with Isabelle / Coq / Lean. Then it counts.
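
For anyone who hasn't seen what "formalize it" means in practice, here's a toy sketch in Lean 4: the statement is a type, and the proof is a term the kernel checks. (The examples are deliberately trivial; formalizing something like inter-universal Teichmüller theory would be an entirely different scale of effort.)

    -- Toy machine-checked proofs in Lean 4: the theorem statement is a type,
    -- and the term after := is the proof the kernel verifies.
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b

    -- A propositional example built with the anonymous constructor.
    theorem and_swap (p q : Prop) : p ∧ q → q ∧ p :=
      fun h => ⟨h.right, h.left⟩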


Math is more about figuring out why things are true than what is true.

Mathematicians want to understand Mochizuki. They couldn't care less if hn thinks it counts.


I don't know why this got flagged.

At the point that we no longer understand the relationship between the formal proof witnesses (and really, the class of possible witnesses) and the axioms we choose, we can no longer do mathematics, because we can no longer meaningfully explore axioms -- our ability to make guided changes is destroyed by our inability to understand their effect.

It's important for the community to understand something of why a thing is true, not just that it's true, because that's what drives the development of mathematics forward. (And indeed, particularly so in the case of the ABC conjecture, which sits at a node between the nature of multiplication and addition, which don't usually have much to do with each other.)

I actually wonder if US (and perhaps other) math education is harmful here: the focus on rote learning and on just knowing that a thing is true (so as to mechanistically apply it) has conditioned people not to understand the hesitance over proofs that humans don't understand -- for most of those people, they never understood the proofs anyway.


That is an excellent point. The future is mechanical proofs, which will bring tools for proof search, proof refactoring, proof minimization, proof navigation, and theorem generation. We'll be able to tackle _much_ harder problems while still being able to get the gist of them.

Sometimes I'm saddened that this future may come slower than we'd like due to imperfect funding structures. But I've grown a lot of patience over the years :)


I'm actually very pro-formalization and mechanical verification -- both for mathematics and computer science. $HOBBYPROJECT involves automated theorem proving, while I'm trying to convince $DAYJOB to adopt some formal methods.

I was just pointing out that the person got flagged for commenting that "witness and dump" isn't actually very useful for mathematics as a field, except as a signal that we should investigate a topic further. But in the case of the ABC conjecture, we already have plenty of incentive to investigate.

I think mathematics and science have a lot to learn from computer science in terms of managing large models, proofs, etc., and that we'll get a lot of automatic tools. That will all be really great.

But there are proofs that are basically just brute-forcing a solution for which we have no higher-level understanding, and those don't really add much by way of knowledge to mathematics. At the point that those are all we can generate for "big" problems, we may be in trouble.


Another excellent point. Right now it's a "winner takes all" competition: it matters to prove a result, and much less to provide an "elegant" proof. I can only hope for a future where we measure the algorithmic entropy of a proof (log proof length) [0], and results like "ABC theorem proof using half the bits of the best known proof" become notable.

[0] https://en.wikipedia.org/wiki/Kolmogorov_complexity


I'm interested in your $HOBBYPROJECT. I'm hoping to develop some software similar to edukera.com, ie using proof assistants as educational tools for mathematics. Would love to swap some cool links and references. Thanks!


I'm actually in the process of (poorly) implementing my own proof engine to try to gain a deeper appreciation for the notions behind type theory (and how e.g. Coq operates):

That a function is the same thing as a proof, with the theorem it proves being the type it constructs and its assumptions being the types it takes in. (Functions are proofs (which in this case, are things that take witnesses of their assumptions and produce witnesses of their theorem), values are witnesses that something is true, types are theorems, etc.)
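
A minimal Lean sketch of that correspondence (the proposition names are arbitrary, purely to illustrate): the assumptions are argument types, the theorem is the return type, and applying the function is carrying out the proof.

    -- Propositions as types, proofs as functions (Curry-Howard).
    -- Modus ponens is literally function application:
    theorem modus_ponens (P Q : Prop) (hpq : P → Q) (hp : P) : Q :=
      hpq hp

    -- A proof takes witnesses of its assumptions and builds a witness of
    -- its conclusion, just like an ordinary function builds a value.
    theorem and_intro_example (P Q : Prop) (hp : P) (hq : Q) : P ∧ Q :=
      ⟨hp, hq⟩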

I have an interest in topology, and want to understand HoTT, but my intuition around type theory wasn't up to par -- so I'm trying to tackle it in a constrained setting (ie, not proving theorems about mathematics as such, but a narrower problem space). Figured there was nothing to do besides get into the messy bits of it.

Can you tell me a bit more about what you're trying to do?


Hah, I'm in the same but opposite situation: I, too, am working on a Coq-like proof assistant, but I understand the type theory far more than the topology needed for HoTT.

Do you have any suggestions for simple introductions to HoTT, especially for someone without the topology background?


My background may be showing, but it may be hard to appreciate HoTT without understanding the role of homotopies in topology. My (limited) understanding of the material is that it's an attempt to introduce a homotopy structure on the type system, and then use that to talk about logical equality (mumblemumble) being the same as homotopy equivalence.

It's then using that equivalence structure between the proofs to reason about constructing proofs, as you can reason about the constructions that are possible out of classes of proofs. And that's basically where I get lost, because I don't quite know enough type theory to understand the structure they're trying to build, so I can't quite get the specific motivations. (The obvious high-level one is better formal reasoning.)

I haven't been following HoTT super closely for a year or two, getting sidetracked into the background, but last I checked there wasn't a ton of simple material on it -- it was sort of read the book, read the Coq proofs/framework, and figure it out. (Though, this easily could have changed.)


Fully agree. But person probably got flagged for commenting that what is important to hn may not be what's important to mathematicians.


> The future is mechanical proofs.

That almost sounds like "the future is mechanically composed novels" (or music). Understanding why something is true is just as important as knowing that it is. Mechanical proofs will be impossible for humans to understand, so the value of such proofs will be rather limited (in that people will still continue searching for a "real" proof).


Mechanical proofs and understanding are not mutually exclusive, that's the whole point! Just like mechanical processes [aka software] are still understandable. A stronger conjecture is that mechanical proofs will enable _better_ understanding, as tools and metrics are developed for tackling "understanding".


Have you ever read a mechanical proof or studied mathematics at the graduate level? It's incredibly difficult to prove even basic things in the current mechanical proof systems, and reading the proofs is even more painful. Maybe there will come a time when mechanical proofs are helpful, but it's not a particularly active area of research, and the amount of work to be done is monumental in order to get to the point you're proposing.


I stole a small part of http://compcert.inria.fr for a hobby project. While this is not "graduate level mathematics", it is "graduate level computer science". I could have never done it if I had to memorize half a book of definitions just to warm up.


> The future is mechanical proofs.

For some value of "future" - sure, maybe. But the vast majority of theorem solutions do not lend themselves to mechanical proofing, and it takes great effort to do it at all.

I don't really agree with your thesis here. I don't see how we're going to get to a future where we're tackling much harder proofs, because the hardness is already what prevents them from being so easily mechanized and checked in an automated manner. We have no problem coming up with new theorems, either - half the job of a pure math researcher is coming up with interesting questions that are too hard to solve immediately but not quite so hard that they're inaccessible.

The only way to make the proofs "harder" is to make theorems that are even further removed from our current mathematical capabilities. Otherwise you're stepping into undecidable territory. Simply put, solving open problems is a major activity in mathematics because it develops new mathematics, not because the actual end result is useful in and of itself. If I prove to you that Pi is normal, the fundamental contribution is (hopefully) a method that is partially or fully generalizable to other domains. No one really cares whether Pi itself is normal, and most already expect it to be. To mechanize that process (if it's even possible at all) requires that the mathematics for solving it already exists, which means that the problem is most likely either 1) uninteresting or 2) overlooked.


People have a lot of very valid complaints about Common Core, but at its core (hehe) it does seem to attempt to start remedying your complaint about US math education, at least at the high school level.

Hopefully that can propagate upward.


I don't see these approaches as necessarily opposed. It seems like in principle, formalizing a proof and verifying that it doesn't have any errors could be a preliminary step towards understanding it? (Though in practice, perhaps this is overly roundabout.)


Also, once enough people figure it out, one (or some) of them can make better explanations.


Indeed. We now actually have some math results of interest for which we have only formal proofs and no human proofs. https://arxiv.org/abs/1509.05468 is an example.


We are very far from the point where that is feasible for many new results in math.


>"If nobody understands a mathematical proof, does it count?"

Just wait until general AI really kicks off; then all new math will be like that. It's not that humans are bad at logical thinking; our weakness is memory. That won't be the case for an artificial agent with instantaneous, perfect recall of everything it has ever seen.


lol. I think the resurgence of interest in constructivist mathematics via Homotopy Type Theory will lead to better proofs (since all proofs in HoTT are like programs and can be computationally verified in a straightforward way).


Even an AI will have various levels of cache. Some memories will be register-level instant, some will be thousands of miles and quite a few servers away, and others in between.


That's a good point. Let's throw "instantaneous" out; having the ability to store and perfectly recall data is a critical advantage for any thinking entity.


>instantaneous

As the memory will exceed RAM and then disk space, that will be less true.


Mathematics is a language. You don't really need to memorize it; you can read it. Unfortunately, most people aren't exposed to anything higher than arithmetic.


Yeah, well, have fun reading e.g. the Stacks project [1], roughly 4,000 pages, without remembering the necessary steps leading up to a corollary.

[1] https://stacks.math.columbia.edu/browse - abstract algebra as far as I can tell


You're sure as hell not going to memorize it.


I remember having to memorize the multiplication table. And that was just the beginning...

