Scrabble GM vs. AI -- the Rematch! Game #10
- Added 18 June 2024
- The Scrabble AI BestBot got the best of me in my 100-game Human vs. AI Ultimate Scrabble Battle, but I'm not ready to cede to our AI overlords! Introducing... the GM vs. AI rematch!
This 100-game series, running every Monday and Wednesday at 5pm ET for 50 weeks, will feature 20-minute games against BestBot with post-game analysis. Hope you guys enjoy, and wish me luck!
BestBot is the upcoming ultimate Scrabble AI from Woogles.io, to be launched in 2024. For questions, please email woogles@woogles.io.
Want personalized help taking your game to the next level or a fun gift for a friend? Check out www.mackmeller.com/lessons for more info or email me at mackmeller@gmail.com! - Games
Definitions of Interesting Words Played in Game 10:
FETERITA [90 pts] (noun) - any of various grain sorghums that are derived from a Sudanese sorghum (Sorghum vulgare variety caudatum) and are characterized by compact oval heads of exceptionally large soft white seeds [from Sudanese Arabic]
SKIDDER [81 pts] (noun) - a worker or machine involved in dragging logs, often using a system of cables to move the logs across the ground, or a vehicle equipped for moving heavy logs or similar materials at a logging site
HOURI [32 pts] (noun) - one of the beautiful maidens that in Muslim belief live with the blessed in paradise [from an Arabic-derived Persian word that went through French]
DRAWEES [95 pts] (noun) - plural of "drawee"; the person on whom an order or bill of exchange is drawn
TURION [25 pts] (noun) - a scaly shoot (as of asparagus and some duckweeds) developed from a bud on a subterranean or submerged rootstock [from Latin]
SENTS [17 pts] (noun) - plural of "sent"; a monetary unit equal to ¹/₁₀₀ kroon used in Estonia from 1928 to 1940 and from 1991 to 2011 [Estonian, from a Latin-derived Finnish word]
SENT in that sense of the word is actually from the same root as English cent. That makes sense (or maybe it makes SENTS?).
@JDHinten That root being Latin centum = hundred - also present in words like centimeter, century and centurion
@annayosh The Latin word "centum" was originally pronounced "KEN-tum," with a hard "k" sound at the beginning. Over time, the "k" sound gradually assibilated, becoming "cent" (with a soft "s" pronunciation) in Old French before entering the English language.
@JDHinten Although Estonian is a non-Indo-European language (it is a Uralic language), it has borrowed many Latin-derived words into its vocabulary, "sent" being one of them.
Wow, that tip about Lonestar indicating no 8s are available would make for a very interesting video. What other non-bingo-indicator words do you use?
Would love that video
I'll think about it! In my last series DEELMNTO came up and I was like "Well I walk by DEL MONTE brand stuff in the canned food aisle and know there's no anagram for that so not gonna waste any time here," so there's one more for you in the meantime :)
@mackmeller only slightly related but i wonder if there's a query that can give us some sort of "most common phonies" or something like that (maybe in mid-low elo play) or "racks most likely to cause a phony" lol. would possibly indicate things like words that people *think* would be valid but are not...
my one contribution here is that I played UNITLESS once and was like 100% certain it would be valid because it's commonly used in science for ratios in which the units cancel out. lol. and missed the valid "UTENSILS", and now I'm never gonna miss that one again!
TRONGLE is one of my favorites. I define a TRONGLE as a promising-looking 7-letter rack that has no 7s in it and also no 8s with any letter! What are your favorite trongles?
@thomascorey7284 I tend to see “dimensionless” rather than unitless.
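For anyone who wants to hunt for trongles by machine, the definition above (no 7 in the rack, and no 8 through any single extra letter) can be sketched roughly as below. This is only an illustration: `is_trongle` and `anagram_keys` are made-up names, the three-word `toy_lexicon` is a stand-in rather than a real word list, and the "promising-looking" part of the definition is subjective and not checked at all. Results are only as good as the lexicon you pass in.

```python
from string import ascii_uppercase


def anagram_keys(lexicon):
    """Map every word to its sorted-letter key, e.g. TOENAILS -> AEILNOST."""
    return {"".join(sorted(word.upper())) for word in lexicon}


def is_trongle(rack, lexicon):
    """Return True if the 7-letter rack has no 7-letter anagram in the
    lexicon and no 8-letter anagram with any one added letter A-Z."""
    keys = anagram_keys(lexicon)
    rack = rack.upper()
    # Any valid 7 disqualifies the rack immediately.
    if "".join(sorted(rack)) in keys:
        return False
    # Try every possible through-letter for an 8.
    return not any("".join(sorted(rack + letter)) in keys
                   for letter in ascii_uppercase)


# Toy stand-in lexicon (assumption): swap in a real word list to get
# meaningful answers.
toy_lexicon = ["TOENAILS", "ELATIONS", "UTENSILS"]

print(is_trongle("AELNOST", toy_lexicon))  # has an 8 through I in the toy list
print(is_trongle("EGLNORT", toy_lexicon))  # nothing in the toy list fits
```

Sorting the letters turns each anagram check into a set lookup, so the whole test is at most 27 lookups per rack.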
As a gambling man, I would’ve considered YORE for 24 saving GIR trying for EMBARKING on turn 2.
I was thinking (K)ERRY for 24 with the same idea, though GIO is a worse leave than GIR.
It doesn't hit or score enough to be worth it.
If I had DSNG or something, I might go for DISEMBARKING, since I could also hit something else with the S, but G is generally a really bad letter in this game.
That's not a bad idea! With 6 N's still out my odds of getting it are pretty decent. That being said I do still give up 7 points immediately and also hold the G, which is a bad tile if I don't draw the N, so it's close whether it's actually worth it. For what it's worth they sim within a fraction of a percent on Quackle, so could go either way :)
I like the strategy behind “cinq” even if it got blocked.
Losecrafting 😂
I'm getting annoyingly good at it this series, aren't I?
8:40 how the hell does he know exactly how many points that would be that’s insane
After a while of playing it's easy to count the base value of the word basically instantly (18 points in this case), and because 9-timers always score the same, it's a familiar calculation to Mack as well to get from 18 points to 18 x 9 + 50 = 212. He's obviously a quick calculator but he's also encountered that particular calculation a bunch of times before, so it's more like memory, the same way that most people wouldn't have to think about what's 6 x 7.
I'm not saying it's not impressive though lol
Haha, yep it's pretty much what Alex said -- there are not very many possible triple-triple scores, and they're memorable enough when you get them that they're pretty engrained. Like if you asked me what the value of a triple triple with 12 total points in the bingo is I could tell you 158 instantly without doing any thinking. As for EQUATORS, I see the Q and know it's 10, and the remaining tiles are 8 since there's 7 one-pointers with one on a DLS, so all I really need to do is get to 18, and as I described above the step from 18 to 212 isn't actually math, it's just muscle memory so to speak.
@mackmeller oh ok that makes sense
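The arithmetic described in this thread can be checked with a short sketch. It assumes standard English tile values and the usual 50-point bonus for playing all seven tiles; the function names and the `dls_indices` parameter are made up for illustration, and which one-pointer in EQUATORS sat on the double-letter square is an assumption (index 0 here), since the comments only say that one of them did.

```python
# Standard English Scrabble tile values.
TILE_VALUES = {
    **dict.fromkeys("AEILNORSTU", 1),
    **dict.fromkeys("DG", 2),
    **dict.fromkeys("BCMP", 3),
    **dict.fromkeys("FHVWY", 4),
    "K": 5,
    **dict.fromkeys("JX", 8),
    **dict.fromkeys("QZ", 10),
}


def nine_timer_total(base):
    """Triple-triple: the base word value is multiplied by 9, plus the 50-point bingo bonus."""
    return base * 9 + 50


def triple_triple_score(word, dls_indices=()):
    """Score a triple-triple bingo; letters at dls_indices count double."""
    base = 0
    for i, letter in enumerate(word.upper()):
        value = TILE_VALUES[letter]
        base += value * 2 if i in dls_indices else value
    return nine_timer_total(base)


# EQUATORS: Q is 10, the seven one-pointers are 7, plus 1 extra for the
# one-pointer on the double-letter square, giving a base of 18.
print(triple_triple_score("EQUATORS", dls_indices=[0]))  # 212
print(nine_timer_total(12))  # 158
```

Because the multiplier and the bonus never change, every base value maps to one fixed total, which is why the handful of common results is easy to memorize.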
Unlucky Mack, I think this has been one of your highest accuracy games based on my macondo analysis:
9F GREY vs 9H YORE is a significant consideration, GREY wins the standard sim 37.11% vs 37.03% without adding inference, but with inference 9H YORE wins 37.55% vs 37.29%. Intuitively I prefer YORE but what do I know about scrabble ;-;
Exchange IIO sims 2nd at 19.16% behind J5 OI (19.85%). Both (T)ORII/(t)ORII options sim worse than exchanging.
SkIDDER sims best! It beats out REDDISh by .1% (16.55% vs 16.44%) and DESIReD by 1%.
(C)INQ sims 6th, a mere 0.6% behind (J)O in terms of win% despite also having the best sim equity. However, at this point all options had win% below 12% so.... yeah. And after DRAWEES, there was no chance left for you :
Thanks! Yeah I did feel I played a very good game here, but what can you do, sometimes the luck just doesn't quite cooperate. Next time!
Another speedy one :) exciting!
Interestingly, and it wasn't even listed among the top equity moves for obvious reasons (it's only 11 points), but you actually had a stylish nine on the LA turn: (G)ESTAL(TE)N, an alternate plural of gestalt.
Whoa, that's pretty cool, nice spot!
if i'm still allowed to make a prediction i think Mack's record will be 42 - 58, -2415
I think it would be interesting to analyze these games from a more mathematical perspective:
1. In what proportion of the game are you (or the bot) up by move N, N=1,2,3,...
2. How does being up by move N impact win probability?
The motivating factor here is the fact that I can't seem to recall too many games where the lead is exchanged back and forth, and it seems a lot of games are decided fairly early on. If Woogles has a public API and I can query this data (assuming no one has done it yet), I would love to look into this over the summer
Rather than measuring by turn, it would be better to measure by number of tiles played, that gives a more correct measure of how far the game has progressed.
That's a very interesting question, I don't know the exact numbers but it's definitely a lot easier to, say, get to a 100 point lead early on than it is to come back from a 100 point lead once you're already down. The reason is that your opponent, once they're up 100, will start playing defense and limiting your options. Even if they don't play defense that effectively, the board will naturally start to exhaust itself towards the later stages of the game, leaving you fewer options to bingo and get back into the game.
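The two questions raised in this thread could be computed along these lines once game data is in hand. This is a minimal sketch under stated assumptions: `lead_stats` is a made-up name, each game is represented as a list of the player's score spread after each move plus a won/lost flag, and the sample numbers are invented purely to show the shape of the input. Nothing here assumes any particular Woogles API or export format.

```python
from collections import defaultdict


def lead_stats(games):
    """For each move number N, return (share of games where the player leads
    after move N, win rate among those games). `games` is a list of
    (spreads, won) pairs, where spreads[i] is the player's score minus the
    opponent's after move i + 1."""
    reached = defaultdict(int)          # games that lasted at least N moves
    leading = defaultdict(int)          # ...in which the player was ahead
    leading_and_won = defaultdict(int)  # ...and went on to win

    for spreads, won in games:
        for n, spread in enumerate(spreads, start=1):
            reached[n] += 1
            if spread > 0:
                leading[n] += 1
                if won:
                    leading_and_won[n] += 1

    stats = {}
    for n in sorted(reached):
        p_lead = leading[n] / reached[n]
        # Undefined when the player never led at move N.
        p_win = leading_and_won[n] / leading[n] if leading[n] else None
        stats[n] = (p_lead, p_win)
    return stats


# Invented sample data (assumption), not real results from the series.
sample_games = [
    ([10, -5, 20], True),
    ([-3, -8, -1], False),
    ([4, 6, -2], False),
]
for move, (p_lead, p_win) in lead_stats(sample_games).items():
    print(move, p_lead, p_win)
```

To measure progress by tiles played instead of by turn, as suggested above, the same loop works if each spread is keyed by the cumulative tile count rather than the move index.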
CINQ was an exciting move. And I liked the LOSECRAFTING SOLECRAFTING ideas lol
I checked, and a sent is 1/100 of an Estonian kroon - which was in use until 2001, when they switched to the Euro.
Correction: Actually, Estonia switched to the euro in 2011 [see my comment]
@AugustusMatthias Glad to stand corrected :)
@AugustusMatthias Thanks for the correction.
Hmm, so back then you could've been like "I sent my friend a couple thousand sents on Venmo to cover dinner"! Oh wait, Venmo didn't exist in 2001 did it 🤣
Could you keep IG on the first turn going for (EMBARK)ING?
That's not a bad idea at all, in fact YORE sims basically neck and neck with GREY (see several comments above for details). YORE does give up 7 points immediately and keep the G, which is clunky if I don't get an N, so there's considerable downside if I miss but also considerable upside if I do hit, and it more or less evens out.
Was this bot trained on more advanced leave values than the ones you may have learned years ago? The leave evaluator on cross-tables is based on TWL 06 vs. Quackle. As helpful as it may be, it is outdated. Your word knowledge and Best Bot's word knowledge don't seem to be that far apart but it seems to bingo earlier and more often than you. Maybe that's the reason.
There's also the fact the bot has knowledge of the entire bag... the architecture should be changed so that woogles server only sends the bot the letters it drew, not sending it the entire bag and letting it pick letters (it picks at random, but from knowing the bag it can infer Mack's rack and hence make blocking plays)
@almightyhydra Please let this insane conspiracy theory go. If the bot inferred Mack's rack on every turn it would win every.single.time. And the code is absolutely crystal clear that there are no shenanigans going on.
It's possible, though I think it would be a bigger difference with HastyBot since HastyBot actually plays on pure equity/static whereas BestBot plays based on sims. So BestBot very often doesn't make the highest-equity play from the analyzer, as is probably readily apparent from the analysis portion of these videos. I can definitely say from experience that I often remember thinking "dang HastyBot kept so many vowels" in my 100-game Mack vs. Machine series, but I rarely have that same feeling against BestBot.
@almightyhydra THE BOT DOESN'T HAVE KNOWLEDGE OF THE ENTIRE BAG
Am I misunderstanding something here? There are many 8-letter words with a rack of AELNOST. An anagram solver produces ANEThOLS, ANOLyTES, ELATiONS, EThANOLS, LAcTONES, LApSTONE, NOTAbLES, STONAbLE, TANgELOS, TOENAiLS, iNSOLATE, and pOLENTAS, although some may be dictionary-specific.
He was referring to the rack with an R on board
@almightyhydra OK thank you.
Maybe if you had magic Nigel powers you could have won
We could all use those couldn't we!
I don't think you can afford to play scared and score 0 instead of taking 9 points and leaving eiir?. You don't need a perfect rack to bingo.
OI also massively kills the northeast quadrant for bingoes.
nothing Mack can do, he is in a no-win situation. best bot is rigged. how many times do we have to see it to know something nefarious is happening
look up the woogles recent blog post about how BestBot is not rigged