Number 1 and Benford's Law - Numberphile
- Added 19 June 2024
- Why is the number 1 the "leading digit" more often than you may expect?
More links & stuff in full description below ↓↓↓
See us test the law using Brady's CZcams viewing figures at: • Brady's Videos and Ben...
Blog about all this at: bit.ly/benfordslaw
Brown Paper from this video on ebay: bit.ly/brownpapers
This video features Steve Mould: www.stevemould.com/ and / moulds
NUMBERPHILE
Website: www.numberphile.com/
Numberphile on Facebook: / numberphile
Numberphile tweets: / numberphile
Subscribe: bit.ly/Numberphile_Sub
Videos by Brady Haran
Patreon: / numberphile
Brady's videos subreddit: / bradyharan
Brady's latest videos across all channels: www.bradyharanblog.com/
Sign up for (occasional) emails: eepurl.com/YdjL9
Numberphile T-Shirts: teespring.com/stores/numberphile
Other merchandise: store.dftba.com/collections/n... - Science & technology
This is why I think Benford's Law holds:
I felt like intuitively this should make sense but I couldn't explain why until I drew it out. There is a higher spread between the next instance of the same leading number the higher you go, making it less likely that you will actually have that digit as a leading number.
So for instance, if you start with 1 the next instance of that leading number is 10. There is a 9-digit spread.
You start with 2. The next instance of that leading number is 20. There is an 18-digit spread.
You start with 3. The next instance of that leading number is 30. There is a 27-digit spread. (Or 27 opportunities NOT to hit 3). AND SO ON...
You start with 9. The next instance of that leading number is 90. There is an 81-digit spread.
Same for the higher values:
You start with 1999. The next instance of that leading number is 10,000. There is an 8001-digit spread.
You start with 8999. The next instance of that leading number is 80,000. There is a 71,001-digit spread. Again, you're going to be less likely to get something starting with an "8" than something starting with a "1".
Thanks, this is exactly what I needed to get it to click! For some reason I couldn't intuit any of the explanations in either the video or other comments. Pointing out the (general) spacing between instances of numbers with the same initial digit highlights the sequential nature of numbers, which is apparently what I wasn't thinking of. Another way to look at what you've pointed out is that (generally) to get to the next number with the same leading digit, you must multiply by the base. This means that the smallest number will result in the smallest distance between instances. Now I see why it's so obvious to some people. It seems very obvious from this perspective.
Is that due to some manipulation of the concept? A spread from 1 to 10 deals primarily with 1-digit numbers, whereas 2 to 20 includes both 1-digit and 2-digit numbers. Also, 10 to 100 only includes 2-digit numbers, whereas 90 to 900 deals more with 3-digit numbers. So can we infer that if we limit the distribution to only 100-999, Benford's law would not be in effect?
I guess the other way to explain this is that the first number is 1, then 2, and so on... this itself is biased, since we tend to start things with 1. Whereas if we count backward, the reverse will be true for 9 :) Say 9999, 9998,... the probability of 9 will be highest at first and then it will gradually reduce :)
Or in other words, I should say the probability of something (unexpected / strange) happening in life for the 1st time will be higher than the same thing happening for the 9th time :)
A Ronse Do you think that the digit keys of computer keyboards need repairing in proportion to their usage according to Benford's law?
This video is about to be a lot more popular.
Lol cue the “here before 1 billion views” comment
Yes, yes it is. Some people know.
That’s what brought me here
So true!
dont get too cocky star fox!
it wont take too long before papa google and friends start scrubbing the whole internet of any damning arguments...
Latif Nasser's docuseries Connected brought me here. I was utterly mindblown.
Same!
Same here! I’m in utter shock
Me too
Sameeee
Same! Shook.
The last explanation seemed most intuitive to me.
Don't know if I'm the only one who noticed this, but at 0:20, the "L" in "Law" is actually the number 1. Very cheeky :P
I happened to have paused the video just at that spot to let it load, so I saw that. Very cool!
I 1ike it that way.
Very sharp of you to see that
Another way of looking at it: Each of the numerals only has an equal shot at being the leading digit if the distribution stops RIGHT before the next power of 10. Some examples of this would be if the distribution was from 1 to 9, or from 1 to 99, or 1 to 999, etc.
If there is any extra "spare change" added to the size of the distribution, such as 1 to 99 being increased so that it now goes from 1 to 135, then that means there are 36 extra spots in the distribution that start with 1 (i.e. 100, 101, 102... 134, 135). These new options now make the number 1 a much more likely choice to be the leading digit.
Distributions never end nicely right before the next order of magnitude, as the point right between orders of magnitudes is just an arbitrary spot on the number line [1]. This means there is almost always some "spare change".
As 1 is the first number, it is the most likely to be part of the "spare change". When we go from one order of magnitude to the next, we go from the 9's to the 1's (99 --> 100, 999 --> 1000, etc). So, if the distribution crosses multiple orders of magnitudes, 1 always has the best shot at being the leading digit.
2 is next in line after we leave the 1's -- 199 goes to 200, 1999 goes to 2000, etc. Thus, 2 has the 2nd best odds of being the leading digit. And then of course the same logic applies for 3, all the way down to 9 being the least likely.
This gives us the shape of the graph in the video!
[1] If we switch from base 10 to some other base, the points on the number line that mark the crossover from one order of magnitude to the next will all switch! But the number line itself hasn't changed -- we are simply relabeling the positions.
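A minimal Python sketch of the "spare change" idea, counting leading digits for the two ranges used as examples above (1 to 99 and 1 to 135). The helper names `leading_digit` and `leading_digit_counts` are made up for this illustration.

```python
def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])


def leading_digit_counts(upper: int) -> dict[int, int]:
    """Count how many integers in 1..upper start with each digit 1-9."""
    counts = {d: 0 for d in range(1, 10)}
    for n in range(1, upper + 1):
        counts[leading_digit(n)] += 1
    return counts


# Range that stops right before a power of 10: every digit is equally common.
counts_99 = leading_digit_counts(99)

# Same range plus "spare change": only the 1s gain anything.
counts_135 = leading_digit_counts(135)

for d in range(1, 10):
    print(d, counts_99[d], counts_135[d])
```

With the range ending at 99, every digit gets 11 numbers. Extending it to 135 adds the numbers 100 through 135, all of which start with 1, so the count for 1 rises to 47 while every other digit stays at 11.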
This. Thank you.
Benford's Law for numbers.
Zipf's Law for words.
Patterns are EVERYWHERE.
+Bari Tenor Or they're nowhere and are instead part of the human mind's ability to comprehend things :^)
So we impose our own internal reality on the world around us??? Nice.
Bari Tenor
Something like that. We take in information with our very limited senses, and our brain makes sense of it with very limited tools. It would follow that we do not experience reality, but rather that we experience our attempt at understanding information.
IDK maybe I'm just bored, forget it
+Bari Tenor They're not really caused from the same thing.
hunterofyou Oh, of course. I understand that. I just like the fact that there seem to be similar-looking patterns for both areas.
I remember watching this video years ago and now fate has brought me back for a refresher.
Same, but I came back for the comments.
In binary, the probability of the leading digit being a 1 is 100%.
+QuotePilgrim Yep, and log2(1+1/1) = log2(2) = 1 !
Can you think of a binary number that doesn't begin with a 1
0010 is the same as 10
+KingHades Na55 Use sig figs
Everything here assumes we're not using numbers that begin in 0
+QuotePilgrim What about 0?
+MK Hammer I realise this aha. I was just explaining why :)
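The formula quoted in this thread generalises to any base b as P(d) = log_b(1 + 1/d). A small Python sketch, for anyone who wants to check the binary case and the base-10 values; the function name `benford_probability` is just chosen for illustration.

```python
import math


def benford_probability(digit: int, base: int = 10) -> float:
    """Expected share of numbers whose leading digit is `digit` in `base`."""
    if not 1 <= digit < base:
        raise ValueError("leading digit must be between 1 and base - 1")
    return math.log(1 + 1 / digit, base)


# Base 10: probabilities for leading digits 1..9.
base10 = {d: benford_probability(d) for d in range(1, 10)}

# Base 2: the only possible leading digit is 1.
binary_one = benford_probability(1, base=2)

for d, p in base10.items():
    print(f"{d}: {p:.1%}")
print(f"binary, leading 1: {binary_one:.0%}")
```

In base 10 the nine values add up to 1, with the digit 1 at roughly 30.1%. In base 2 the single value is 1, matching the comment above.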
suddenly relevant again
would love to see the analytics for this video some day
2020 election will sure introduce this law for a lot of persons.
That's why I'm here
Well that’s why I’m here
Wait so how do I apply this?
@@anamarte9859 you use log10((n+1)/n) to get the expected share for leading digit n. With my basic understanding: if any digit ever turns up more often than 1 does, then there is something wrong.
Same here
What’s the standard deviation and at what point does a deviation become a mathematical impossibility? Asking for my ballot counting supervisor
“Asking for my ballot counting supervisor.” 🤣🤡🌎
🐸🐸🐸
😂😅🤣😂😅🤣😂
5.5 standard deviation above the mean for voter turn out in Wisconsin.
My understanding the median voter turnout rates float around 63% with the standard deviation of 7.5... oddly enough some of the districts Biden won are five standard deviations ahead of the mean which is statistically impossible.
@@Anonymous______________ according to Wikipedia
"6σ event corresponds to a chance of about two parts per billion. For illustration, if events are taken to occur daily, this would correspond to an event expected every 1.4 million years"
I make sense of it by thinking about lining up an infinite set of rulers end-to-end, running my finger over the top, stopping after some random amount of time and then adding up all the 1,2,3's etc. I've passed over. Obviously it all depends on where you stopped on the final ruler, but on that last ruler you've most likely passed 1, and least likely to have passed 9.
Data sets which span a bunch of orders of magnitude and are not normally distributed are kinda like this; the numbers are finite and have to stop somewhere. But you have to think of each "ruler" as containing ~10x more data than the one before it (1-9 vs 10-99). So the "final ruler" you stop on has great influence on the count as it has the most room for data.
Within this "final ruler" you'll have a bunch of 1's but few 9's, as the 1's will "fill up" first since we assume the data to be relatively smooth and finite. If the 9's were to "fill up" it would mean data spilling over into the 1's of the next ruler and that becoming the "final ruler". So the 9's never really catch up.
Ohhh this is a really clever and intuitive way of looking at it! Thanks!
Huh. I remember watching this years ago. Funny how some things come around again. I love this timeline!
Ridin' with Benford
Youd get in a car with a war criminal? Not safe, stranger danger and such
7:45 THANK YOU for this explanation! It makes so much sense when you think about it that way
Steve has a lovely voice and accent.
Steve is lovely
You're lovely!
J.J. Shank we are all lovely including you :)
When I fudge numbers on my tax returns, I always keep this theory in mind, and start it mostly with a '1'.
Keep the greed at bay and nine away.
To me this was one of the most incredible videos on this channel, I really had no idea about this law. Amazing!
I loved this video when it came out. It better stick around now.
Who’s here after the 2020 Presidential Election? 😂😅
Me, trying to figure this out.
These fuckers need to be dealt with
Me.
@@WilliamFBogey cope
@@1234SLUR biden shills on youtube too x)
I got a soft spot for scale invariance so I really enjoyed this. One of my favourite numberphile videos.
Honestly this comment is 8 years old and I have no clue if you still use this account but omg that just sounds so cute. "I got a soft spot for scale invariance". ☺️🥰😊
This video is awesome! You explained Benford's law clearly in less than 10 minutes while some Netflix episode can't do it in 45 minutes.
Stand-up Maths just put out a great video explaining this in relation to the 2020 election results
Something tells me this video is gonna get A bunch more views here in the next couple days lol.
And then get taken down for misinformation 🤣
Matt Parker has already debunked that, check his video
I love when you relate your cool math tips to finance! It makes it so much more fun to learn! Tax advice, NASDAQ example, you should do more math/finance stuff! :)
great explanation and insight into benfords law. well done numberphile guys
Make Benfords Law Great Again!
Wow. This guy is so awesome! I wish i could draw such a large graph with such accuracy. This is so neat.
great - glad to have helped.
heads up, you about to get more popular :)
So happy that I am the owner of this piece of paper! A nice piece of numberphile history from yet another great video
Fascinating, I do like your enthusiasm.
What about CZcams video views and subcriber counts? Will it work or will we detect fraud?
You could check by looking at your subscribed list, since it shows the channels and sub counts, or go to, say, PewDiePie's videos, as he has a massive collection spanning many years, and check views or likes or whatnot easily.
Radiolab's "Numbers" episode mentioned this, and it also had a segment suggesting that humans are born thinking "logarithmically" and only later learn the conventional "counting" view of numbers, which could explain why some people think it natural for the digit "1" to come first more often.
People indeed seem to think in exponents. That's the only way to grasp existing differences in the scale of the world.
Very well explained. Thank you for this.
I knew of this, and for the first 7 minutes I was wondering how it would work in other bases... and then you answered it. Nice!
I'd love to see a video on uncomputable numbers. You could do a series starting with the integers, then moving on to rationals, reals, complexes, and computables/incomputables
@3:12 lol , I know it, you know it, we all know it! right?
you know why you're here XD
Brilliant and succinct explanation. Saved me 9 minutes. Thank you.
great video brady. and great explanation. this would be a cool brown paper to have
Came back after the powers of 2 video, it's still hard to believe!
Do powers of 2 obey Benford's law? 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, 1048576, 2097152, 4194304, 8388608, 16777216, 33554432, 67108864, 134217728, 268435456, 536870912... when will I get a number that starts with 9?
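One way to answer the question is simply to keep doubling and look. A short Python sketch; the function name and the `max_exponent` cut-off are choices made for this example only.

```python
def first_power_of_two_with_leading_digit(digit: int, max_exponent: int = 1000):
    """Return the smallest exponent n such that 2**n starts with `digit`.

    Returns None if no such power is found up to max_exponent.
    """
    value = 1  # 2**0
    for exponent in range(max_exponent + 1):
        if str(value)[0] == str(digit):
            return exponent
        value *= 2  # Python integers are exact, so no rounding issues
    return None


first_nine = first_power_of_two_with_leading_digit(9)
print(first_nine, 2 ** first_nine)
```

The first power of 2 that starts with 9 is 2^53 = 9007199254740992, so it takes a while, but a leading 9 does show up.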
I'd say Benford's law is intuitive. 1 is the default number, it's easiest to reach. It gets harder and harder to reach later numbers, so I'd expect a consistent downward curve, which is what the video shows. Fun stuff.
+Ofordgabings Yes. Let's suppose that for a distribution the probability of x decreases as x gets larger. This is true for several distributions (at least above a certain value). E.g. think of countries: it is harder to get to 90 million people than 10 million people, or 900 million as opposed to 100 million, and thus the figures with a 1 in front always get the largest probability, then 2, then 3.
Wow, great explanations! Thank you!
Great video! I love how you explained it. I was one of the dummies like "whaaaatttt noooo wayyy, even distribution"
This video will get censored or demonetized
Thanks for explaining this. Saw this idea on Connected but felt unsatisfied with their insinuation that this was some kind of proof of numerical "fate" ... this explanation makes much more sense to me... or at least I think it does 😅😂
Agreed. I saw this on Connected as well and was completely fascinated by the "fate" aspect they presented. I now know they went that route purely for entertainment purposes as this video more clearly explains Benford's law in a much shorter video length. Interestingly the popularity of this video has sky rocketed after our recent US election from people who do not fully understand Benford's law and are using it as "proof" that Trump actually won the US election. Ughhhh!
I work for a broadband cable provider in the US, assisting customers via web chat. I checked the chat session times (in seconds) and sure enough, the distribution followed benfords law almost exactly. This makes a lot of sense because average session time is around 960 seconds. There's a lot more chats in the 1000-2000 second range than there are in the 900-1000 second range
Fascinating. Informative. Enriching. Gracias
I wanted to see this for myself, so I did this in Excel. In column A, I put all numbers (1 to whatever) and for every number I looked what the first digit was. At 1, I only found one number starting with 1. At 2, I found one number starting with 1 and one starting with 2, and so on. For every number I calculated the percentage (also counting the previous percentages). I did this for the numbers 1 to 9,999. I got a beautiful graph with in the end the following percentages:
1: 24,1547%
2: 18,3273%
3: 14,5466%
4: 11,7363%
5: 9,4973%
6: 7,6356%
7: 6,0420%
8: 4,6489%
9: 3,4112%
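A rough Python reconstruction of the spreadsheet described above, under one assumption about what was done: for every N from 1 to 9,999, take the share of the numbers 1..N that start with each digit, then average those shares over all N. The function name `averaged_running_shares` is invented for this sketch, and the Excel layout itself is not known.

```python
def averaged_running_shares(upper: int) -> dict[int, float]:
    """Average, over N = 1..upper, the share of 1..N starting with each digit."""
    counts = [0] * 10    # counts[d] = numbers seen so far that start with d
    totals = [0.0] * 10  # running sum of the per-N shares for each digit
    for n in range(1, upper + 1):
        counts[int(str(n)[0])] += 1
        for d in range(1, 10):
            totals[d] += counts[d] / n
    return {d: totals[d] / upper for d in range(1, 10)}


averages = averaged_running_shares(9999)
for d, share in averages.items():
    print(f"{d}: {share:.4%}")
```

Note that this averages over every possible stopping point N, which is not the same thing as Benford's formula log10(1 + 1/d); that would explain why the figure for 1 lands near 24% rather than 30.1%.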
Hi Koen,
I was just wondering how you made the graph as I am needing to make this for a maths project I am doing.
?? It doesn't work like that, man. With a normal distribution like the one you mentioned, the probability of starting with any of the above digits is equal, so I don't know how you got such a solution.
@@krowa1010 I'm gonna be honest with you. This is a very old comment. I vaguely remember it. I can't really remember the video either. I'm not sure if I calculated something and my explanation is just bad or that I made a mistake. But reading this now, I can only say that you are right. There's clearly not a 25% chance that a random number starts with a 1. Maybe I'll watch the video again. If I find out what I actually meant, I'll let you know. Otherwise, we'll speak to each other in 5 years. ;)
Huh. Wonder why this could possibly be trending.
Seeing this video for a second time makes me feel old. 7 years ago didn't feel that log ago, the video quality says differently.
What an interesting statement: "We usually don't hang on with nines"... Great video, Brady.
We're about to have a civil war over this lol.
“Benford’s Law, Right Flank! Standard Deviation, Left Flank! Charge!!!”
@TheKillSwitch they don't need to fight, cuz if we're being honest, you know the republicans aren't gonna do anything. Despite media claims of "right wing militias", it's the left that actually burns things to the ground and attacks people at random over ideology. A few exceptions on the right? Sure, a couple shooters here and there, but if we're talking about whole mobs burning down cities, that's 100% the left.
@@magicskyfairy69 It might come down to people being attacked regularly in pockets for stating political views, which is no way to live together. We're not far off from that reality already.
@@mdflonline so the right will be quiet. and our kids will be propagandized to join the SJWs. then there will be no more "right". Given the trajectory, there will eventually be communists, and socialists will be the new right wing. The idea of being a libertarian or conservative will just fade away.
For all of the people commenting about elections, see Matt Parker’s video on the exact topic
Yo, everyone scrolling past, do please comment on this to give it a signal boost.
@@clockworkkirlia7475 aight
@@clockworkkirlia7475 alright
@@clockworkkirlia7475 alright
@@kookykutter69 alright
As of August, 2020 - the NASDAQ is about 11 thousand. This video has 1306 comments and 11 thousand likes (160 dislikes). The video had 656 thousand views. And Numberphile has 3.4 million subscribers.
Frank Benford: ironic that he was born in PA- no? He might be The American Patriot we never knew we needed.
@3:12 Look at the example article. Do you believe now?
I’m gonna need him to explain the election results now lol
Matt Parker has already debunked this fraud claim, check out his video
@@justanormalyoutubeuser3868 thanks, I’ll check it out
The way that it intuitively made sense to me as soon as he said the law was as follows:
As we go up through the numbers we could stop at any point (this is the highest value in our set). No matter what the highest value is, the probability of any value below it starting with 1 can never be beneath 11%. However in most cases it will be above 11%, because we'll have gone through all the 1s but not all the other numbers.
Funny how they basically described the way I was thinking about it at the end :) Nice video
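The "never beneath 11%" claim can be checked directly. This is a minimal Python sketch that tracks, for each possible highest value N, the share of the numbers 1..N that start with 1; the name `running_share_of_ones` and the limit of 5,000 are arbitrary choices for the example.

```python
import math


def running_share_of_ones(max_upper: int) -> list[float]:
    """shares[N - 1] is the fraction of 1..N whose leading digit is 1."""
    shares = []
    ones = 0
    for n in range(1, max_upper + 1):
        if str(n)[0] == "1":
            ones += 1
        shares.append(ones / n)
    return shares


shares = running_share_of_ones(5000)
lowest = min(shares)

# Which stopping points N actually reach that lowest share?
lowest_at = [n for n, s in enumerate(shares, start=1) if math.isclose(s, lowest)]

print(f"lowest share: {lowest:.4f}, reached at N = {lowest_at}")
```

The lowest share is 1/9 (about 11.1%), and it is only reached when the range stops at 9, 99 or 999, i.e. right before the next power of 10. Everywhere else the share of leading 1s is higher.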
Shouldn't the log scale pattern start at the top in a W type pattern instead of M, as the first measurement would be one digit of 1, thus 100%, the next being two digits of which 1 is represented by 50%? Not arguing the formula or the maths, just wondering about the log chart.
Yeah, I noticed that too. Even starting with the raffle example, it should begin at 100%, then decrease towards the 10% range.
Yes, I also thought it would be a sawtooth wave, but the mirror image of what Steve has shown. I mean sharply increasing then falling with a decay constant, not the opposite. Because the moment you reach a power of 10, the probability rapidly increases and then continues to fall slowly up to the next rapid increment.
Who came after after Stand-Up Maths video???
Well, I did anyway
I did! This is one exhausting comments section huh.
@@clockworkkirlia7475 Trump fanatics watch one video on a law they heard about two minutes ago and suddenly receive Nobel prizes in statistics.
I did.
@@dandelo6479 lol
This was fun. Please do some videos on other maths tricks used in numerical forensics!
The last description is something I have noticed, so I knew that 1 tends to hang around a lot more than 9. Simply put: Everything starts with 1 so 1 should be more common. But good video that gives several different perspectives on this phenomenon.
For the election data gathered by county, you have to realize that most counties are cut up into roughly equally populated zones. So immediately that's problematic for Benford's law, because you need a very wide distribution of numbers, and counties are PURPOSEFULLY made to be roughly equal in population size.
The election returns for Biden by county weren't orders of magnitude different from each other; they're within 100-1000. A wider distribution is REQUIRED to trigger Benford's Law. Keep in mind, the law only works because "leading 1s" return every time you hit a new order of magnitude. It was not triggered for Biden because his performance was consistently within a single order of magnitude. Since this condition is not satisfied, you have to look at the last digit of the county returns, which should be evenly distributed 0-9, and indeed they are for Biden.
Donald Trump's returns do follow Benford's law, because his range of performance by county was larger (10s to 1000s), spanning two orders of magnitude. This is why his results DO satisfy Benford's law.
EDIT: I said counties when I meant districts
@Hyakkaten Trump's victory would defy logic. Luckily he didn't win, so we get to keep logic
I love how his beard matches his jacket.
Mouldy old video ;-). Glad to see both these tubers improve their delivery over the years
I love how this video feels like it’s been shot the morning after a wedding and everyone else in the hotel is still in bed
You can see him in 3:12
hi /pol/
Hello there
This is just awesome!
My favourite host, dark, relaxed and charming, along with Matt Parker.
Benford Biden
The election in 2020 may take way longer than before.
@SmoovCat biden was confirmed winner
@SmoovCat literally every time the media says the winner they get it correct
@@hamoshytube1853 George Bush vs John Kerry ring a bell? You're a literal NPC.
@@BasedBowlCutEnjoyer sorry bro i was born on 2006 and i am not american so i dont know what happened in america before 2006
@@hamoshytube1853 then dont make false claims
CZcams recommending this 7 years later now that Steve has a big following on his own channel
I loved that explanation with the lottery tickets, i can see why now! Thanks!
Biden watching this : come on man
Did you ever foresee this video being an integral part of the 2020 US election?! Haha
I'm generally lousy with math, but I can proudly say this made intuitive sense to me. It's easy to imagine that going from a per capita PPP GDP of 10,000 to 20,000 should be vastly more difficult than going from 90,000 to 100,000, because of how much harder it is to double productivity at any level than squeezing out another 11% - which then leads to being back at that numerical wall of having to double, triple, X4-9 again. What's neat is that this effect persists with different base sizes.
This became intuitive the moment I pictured log graph paper. Suppose that each power of 10 is 1 cm high, and you collected a random sample of heights from 0 to n cm and graphed them. The distance from 1*10^m to 2*10^m is about 0.30103 cm (for any m), so it makes sense that there would be about 30.1% of the numbers landing in these sections.
Another way to think of it is that 10^x for any x>0 will have a leading 1 if the decimal part of x is between .0 and .30103.
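The log-paper picture translates almost directly into code: take log10 of the number, keep only the fractional part, and see whether it falls below log10(2) ≈ 0.30103. A small Python sketch; `leads_with_one` is a name invented here, and taking the fractional part with `% 1.0` is what lets it handle values below 1 as well.

```python
import math


def leads_with_one(x: float) -> bool:
    """True if the first significant digit of positive x is 1.

    Uses only the fractional part of log10(x), i.e. the position of x
    within its own power-of-10 band on log paper.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    position_in_band = math.log10(x) % 1.0
    return position_in_band < math.log10(2)


for value in (1.5, 150, 0.0123, 250, 9.9, 10 ** 0.2, 10 ** 0.5):
    print(value, leads_with_one(value))
```

Values that sit exactly on a band boundary (exact powers of 10, or exactly 2 times one) can be sensitive to floating-point rounding, so this is meant as an illustration of the idea rather than a robust digit extractor.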
who's here after watching connected on Netflix? 😂😅
me! hahaha🤣
Me!
this explained it so much better and took away all the mystery lol
Me.
MEEE
Matt Parker made a video about this, about 2020 election and just watch it before checking out any comments below.
That is why I am here as well. It is worrying how many people think it is election fraud based on this.
@@zylan4967 :/
The Monty Hall problem is my favorite. It always seemed intuitive to me.
i love numberphile videos
Does Benford's Law remain true for different number bases? If you took data that conformed to Benford's law in Base 10, and converted it to Base 7, or Base 9, would it still conform to the law? What about higher bases?
one is still the most frequent number (it is the lowest number so you're gonna hit that if anything, and lower chances up and up. they dont say that but that is best explanation)
It remains true for all bases. The key to understanding it is that the distribution must be independent of normalization. For example, if you multiply all the numbers in your data set by a constant then the distribution of leading digit probabilities must remain the same.
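The scale-invariance point can be tried out numerically. The Python sketch below makes two assumptions that are not from the comment: the sample is drawn log-uniformly over six orders of magnitude (a distribution known to follow Benford's law), and the constant 2.54 (as in inches to centimetres) is just an arbitrary rescaling factor.

```python
import math
import random
from collections import Counter


def leading_digit(x: float) -> int:
    """First significant digit of a positive float, via scientific notation."""
    return int(f"{x:.15e}"[0])


def digit_shares(values: list[float]) -> dict[int, float]:
    """Share of values starting with each digit 1-9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed test data: log-uniform between 1 and 1,000,000.
sample = [10 ** rng.uniform(0, 6) for _ in range(100_000)]

original = digit_shares(sample)
rescaled = digit_shares([v * 2.54 for v in sample])  # change of units

for d in range(1, 10):
    expected = math.log10(1 + 1 / d)
    print(f"{d}: original {original[d]:.3f}  rescaled {rescaled[d]:.3f}  "
          f"Benford {expected:.3f}")
```

Both columns should stay close to the Benford values, differing only by sampling noise, which is the invariance under a change of units described above.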
Look, it's Harry Potter. Grown up and lost the glasses. :) Love the accent!
What I learned from this video: "No one is nine meters tall."
All joking aside, this is an incredibly interesting concept, and I'm glad I discovered the magic of Numberphile. Keep up the good work, guys!
elarsen7823
Once this is all locked down by all the teams.... I guess you will be ready to launch the next internet!!
"You don't hang around the 9's" seems to be the best explanation to me
US Election 2020 has forced me to learn a Maths Law that I've never heard of and to be honest, I don't mind. The next two months are gonna be lit.
Is this similar to Zipf's law?
"Let's say there's a list from 1 to 900"
original quote.
That’s just the craziest thing I’ve ever heard. I’m really having a little trouble getting over it.
Hmmmm wonder what Detroit and Milwaukee’s voting charts look like
Just to be clear for everyone who doesn't understand the law properly: this law only works when your data spans many different orders of magnitude. Because precincts are all roughly the same size, it does not apply.
Just compare previous years' data. That will prove or disprove it. No need for guesswork when we can use facts.
Can't you use the final totals instead of individual precincts?
That book seems very interesting!
Nice one, Eddi.
tfw your vote batches cluster around five
totally not fraud. nothing to see here.
Biden right now "WHY MICHIGAN & WISCONSIN!? WHY?!"
Smooth brain on the fixer team, rip democrats.
oops
Biden skipped two during the debate. He went straight from "Point one" to "and point three". Maybe that's why his fake ballots are throwing off the distribution.
7:14 huh. . .
1:38... I just want a clip of that
You're explaining something so easy.