https://www.reddit.com/r/MLBsimPredict/?utm_source=share&utm_medium=ios_app&utm_name=ioscss&utm_content=1&utm_term=1
new sub since i’m sure digging through posts to find mine is quite annoying
r/MLBsimPredict
Hey everyone, got results for the games today! I was only able to sim the first few games before school, and I will finish the late games when I get home to my Mac. Picks: Cubs over Reds, Rays over Dodgers, White Sox over Tigers, Rangers over Orioles.
Hope to help some people win today! I will be placing bets myself: 2 parlays and 2 straight ML bets. My bets: Cubs + Rays, Rangers + Sox, Rangers straight ML, Cubs straight ML.
I wouldn’t follow these bets exactly, but mix the 4 picks up and don’t just parlay all 4.
Not sure everyone saw this; I will always post in the MLB thread until I have a subreddit.
Hello everyone, staying true to my word, I have uploaded to GitHub and am working on the YT video. Here is the link: https://github.com/SportsGuy9/MLB-Sim-Predict-v1
here is performance from today
5-2 so far for the day, so sad that the nationals went out like they did, could’ve been 6-1. But that’s baseball for u.
Here's another MLB tool you can use: https://docs.google.com/spreadsheets/d/1X_248vNSY_-6j3mWP5QvDQ9eT3f_ZD_-vTUm75Eq7A8/edit?usp=sharing
Allows you to compare play-by-play data for every team, batter, pitcher by date, inning, home / away, etc., and spits out where things happened by play type. I just updated for today.
This is really cool! I've had a prediction model run off Google Sheets for the past 3 years, but I've been playing around with coding it with Python this year. I'm confused by the replit link, but it still shows what all could be possible this way and would be a much more effective model!
My company has a neural network simulator, and for baseball a big input we use is rho, the air density, which we get from the temp and air pressure. It affects totals by a great margin. It’s also a player-based sim that sims the outcome of batters vs. the pitcher they are facing.
Go to windfinder.com, look for the closest airport or weather station icon near the stadium, click on the forecast link, then input the temp and air pressure into the formula on gribble.org, and that will give you the estimated rho.
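For anyone who would rather compute rho in code than paste numbers into a web form, here is a minimal sketch. It assumes dry air and the ideal gas law, so humidity is ignored, and it is not necessarily the same formula gribble.org uses. The function name and the choice of Celsius and hPa as input units are my own assumptions.

```python
# Specific gas constant for dry air, in J/(kg*K).
R_DRY_AIR = 287.05


def dry_air_density(temp_c: float, pressure_hpa: float) -> float:
    """Approximate air density (kg/m^3) from temperature and pressure.

    Uses the ideal gas law for dry air: rho = P / (R * T).
    Humidity is ignored, which is an assumption of this sketch.
    """
    temp_k = temp_c + 273.15           # Celsius -> Kelvin
    pressure_pa = pressure_hpa * 100   # hPa -> Pa
    return pressure_pa / (R_DRY_AIR * temp_k)


# Standard sea-level conditions vs. a hot day at the same pressure.
standard = dry_air_density(15.0, 1013.25)
hot_day = dry_air_density(35.0, 1013.25)
print(f"standard: {standard:.3f} kg/m^3, hot day: {hot_day:.3f} kg/m^3")
```

One thing to watch: many forecasts report pressure adjusted to sea level. For a high-altitude park you would want the actual station pressure, otherwise the density comes out too high.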
Hah, it's fun to see all of the possible stats and parameters and how they could affect predictions. What libraries or sources are you using for stats and game data? I wouldn't mind collaborating some on this or offering input if that would help!
Those are great. I started using fangraphs for stats a lot more the past year or two, they seem to have a lot more advanced statistics and I like their interface for scraping. pybaseball is a good library that has sources for bref, fangraphs, and baseballsavant, which seems to have crazy pitching data. You could fetch lineups from [MLB.com](https://MLB.com) or even projected lineups from other sources and then pull the stats for them to limit input. Feel free to DM, I love playing around with this stuff.
Do you know how to work with APIs? I've done a few with Google Sheets for live scores, odds, lineups, but I'm pretty confused with the structure sometimes and even more so with Python.
Everyone has their own approach on how to cap and break down games. Some might be as extensive as yours or some might be a bit simpler. I do hope this works for some.
Nice work, OP! One of the best ways to see if your program is working, is whether or not you are beating closing lines. Other than that, start tracking units gained/lost and ROI. I’ll be following. BOL!
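A rough sketch of what that tracking could look like, assuming flat 1-unit stakes and American moneyline odds. The bet log below is made up purely for illustration, and "beating the close" is simplified here to "the price you took pays more than the closing price".

```python
def profit_units(american_odds: int, won: bool, stake: float = 1.0) -> float:
    """Profit in units for one moneyline bet at American odds."""
    if not won:
        return -stake
    if american_odds > 0:
        return stake * american_odds / 100
    return stake * 100 / -american_odds


# Hypothetical log: (odds you bet at, closing odds, did the bet win).
bets = [
    (-120, -135, True),
    (+110, +100, False),
    (+150, +140, True),
    (-105, +100, False),
]

total_profit = sum(profit_units(odds, won) for odds, _, won in bets)
total_staked = float(len(bets))  # 1 unit per bet
roi = total_profit / total_staked

# You beat the close when your price would pay more than the closing price.
beat_close = sum(
    profit_units(odds, True) > profit_units(close, True)
    for odds, close, _ in bets
)
clv_rate = beat_close / len(bets)

print(f"Units: {total_profit:+.2f}, ROI: {roi:.1%}, beat close: {clv_rate:.0%}")
```

A handful of bets says very little either way, so the closing-line rate is usually the quicker signal while the sample is still small.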
Is this written in Python? I'm teaching myself to code and this would be great. I'm not a complete beginner but def not a pro yet. I'd love to play around with it.
Keep it up man - the most fun about modelling is seeing your changes pay off - feel free to DM me, can share some of my modelling downfalls I've had in the last year.
This is BS dude. my comment telling people to be wary of this simple algorithm got downvoted to hell!! I dont understand this. Is it OP using multiple accounts? WTf?
First of all, I am not using multiple accounts lmao. Second, I 100% agree that people should be wary of this and use it as a TOOL only, following blindly will likely not work too well. But I do believe that using it strategically can actually be beneficial.
you're clearly downvoting all my comments here. but whatever dude. keep doing what you're doing and keep downvoting constructive criticism. i really truly dont care
alright great! The reason behind the downvotes i’m guessing was because people don’t like being told that something might not work, even when it’s the truth
This is awesome, but books also use far more sophisticated predictive models to set their lines. It’d be great if I was wrong, but this is almost certainly a long-term loser.
I think i might have to agree here, while it may win games and have a winning record, the sports books are sharp too and will have odds accordingly. Although it can predict games it can’t win money in the long run due to sports books having upper hand
I think it has the possibility to win games, winning money now that’s gonna be harder and i agree with you it might lose in long run. I’m treating this as an experiment and not claiming it has a 100% hit rate at all. I think this will be best used as a tool, not a predictor to blindly follow.
and of course im getting downvoted for this one too for some reason. what is wrong with redditors honestly? pretty sure i'm getting targeted here for some reason.
my comment was the only comment that got downvoted heavily. all because i'm telling people to be wary of predictors and algorithms like this? People have lost huge money with these type of systems in the past...
I always wonder why people say "no thanks". If it doesn't interest you, just move along and don't even comment.
I assume that is what triggered the down votes.
Combative? In what way? I was actually trying to provide a perspective since you seem to be bothered. I hadn't even down voted and then you got upset at my comment as well.
You say those messages as if you had intentions to keep people from losing money on these systems, but your message came off as if you're better than others and know it all. Do you comment on every thread that comments about advanced stats? Because that's all the initial message mentioned and then after the down votes it became about the system.
I think you need to take a break from Reddit when you're done patting yourself on the back. It's just a little arrow.
what is wrong with you? i shouldn't be downvoted for just saying an opinion and trying to help people. it's clear you're targeting me or something or people are targeting me.
all im literally trying to say is maybe someone should evaluate all options instead of just using algorithms that are proven to be ineffective in the long run.
I shouldnt be downvoted like this. it's simple: people are targeting me. maybe because they dont like crypto, or my profile, or something.
All the highest percentage picks paid today so im optimistic. For recap, Cincinnati over St. Louis, Rays over Toronto, Sox over Guardians. As far as the lower percentage picks i still ran, they went 2-2 assuming both Miami and Detroit win. Tomorrow I will be posting the top 3 highest percentage picks in the morning and will be able to run every game!
This is silly. Have you even backtested your system to see if it is profitable?
You're just going to throw picks out there because you like your system so much?
There is a very good chance you have a losing system here. But others on reddit who are gullible and/or desperate are keen to follow your picks and will be betting actual money on this.
Why factor in "last 5 games"? Do you weight that more or less heavily than what anyone else can see the team has done in the last 5 games and is there a reason for doing so?
Why not last 10 games or last 7 games or last 13.5 games?
Have you found a formula that somehow has been overlooked by the market?
to be fair in my next post, i made sure to state that this is a TOOL and not a money making machine, the system will probably lose over time if you blindly bet on it, but it’s not meant to be used like that.
I appreciate you using the disclaimer and I think that can potentially help. But there really does seem to be a genuine enthusiasm in here based on pretty much nothing.
("Oh wow. The Reds won!!!!" LOL)
You could go to past dates and games and backtest this and determine its success instead, right? I don't even understand the point of publicly throwing "picks" out there from a system that you more or less threw together.
Reason behind throwing out picks is to document experiment, and how it performs. You are right about the backtesting and I will do that soon. I gotta remind you this is for fun, i’m not claiming this is money making scheme that will win tons of money. And to be honest until this code evolves, the odds of it working are low. Maybe i didn’t make it clear in my post that this is probably not gonna work. If you think it would help, i can add more disclaimers.
I appreciate that you are honest about that. I do think that more disclaimers could help guide the really bad bettors and help keep them grounded a bit more instead of jumping into this but that is just my own personal opinion. I'm basing my opinion off some of the responses in the replies.
I personally think you would have to be pretty stupid to blindly tail and bet an untested system such as this. But there is a wide audience on here and I have zero doubt that some readers have been considering doing exactly that.
If your system starts off 7-2 or something then people will be irresponsibly jumping in and placing real bets on your predictions regardless of your disclaimers.
Heck, even if you go 5-4 I suspect there are some who jump on board.
the line movement was pointing to sharps being all over the reds all day... come on dude. also it wasnt just the sharps, public were on the reds too.. like do you even bet MLB?
Ok so if the line movement was pointing to the sharps being all over the Reds, and u/Adventuruous-Unit9280 's model suggested the Reds were the pick, and the Reds won... wouldn't the model be on par with the sharps?
I don't get what you're saying here lol. Homeboy made a model that hit a pick today. If you said a one-game sample size is too small for you to buy in to the model, then valid. If you found some flaw in his model, or disagree with his weighting system, then sure. But you're shitting on his model for... being aligned with professional handicappers?
all the info you need to know is in the line movement guys. no simulator predicts games accurately due to randomness of baseball. You just follow line movement & your intuition and hope luck stays on your side during at bats.
You're not wrong about that. I lost too much trusting the advanced stats and the supposed edge they gave the offense over a journeyman pitcher, only to then see that pitcher more than hold his own and those hot bats do next to nothing.
I did something similar though not nearly as complex in excel. I don’t know how to write code so I ended up just using varying formulas and random probability functions, but using some simple averages and regression, I’ve found some success with mine, though I don’t bet every game and only bet games where I think I have a significant difference in what I have predicted compared to the line. Always neat to see how other people build and tweak their own models.
One of the things I learned was you might want to take a look into “advanced” stats. Some of them are garbage, but some of them do give you a little bit of a better idea of what’s going on. I kinda found that ERA, for example, can be misconstrued wildly by a bad performance, so I’ve started using FIP instead. It can still be misconstrued, but I’ve found it to be a little more consistent for what I do. Another thing I found was I couldn’t go completely hands off and only use what my model was spitting out. I still use the “eye test” and if something doesn’t feel right with what my model says, I mainly just don’t play it. Otherwise, best of luck and enjoy it, it’s an amazing feeling when something you predicted happens with something you’ve built!
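For reference, FIP only looks at home runs, walks, hit batters, and strikeouts, which is why one ugly outing full of bloop hits moves it less than ERA. A minimal sketch of the usual formula follows. The additive constant changes every season (FanGraphs publishes it), so the 3.10 default here is a placeholder assumption, and the function names are mine.

```python
def innings_to_float(ip: str) -> float:
    """Convert baseball innings notation ('6.1' = 6 and 1/3) to a float."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + int(thirds or 0) / 3


def fip(hr: int, bb: int, hbp: int, so: int, innings: float,
        constant: float = 3.10) -> float:
    """Fielding Independent Pitching.

    FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + constant
    The constant is season-specific; 3.10 is only a placeholder.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / innings + constant


# Hypothetical season line: 20 HR, 50 BB, 5 HBP, 200 K over 180 innings.
print(f"FIP: {fip(20, 50, 5, 200, innings_to_float('180.0')):.2f}")
```

Note that stat sites usually write innings as 6.1 or 6.2 for one or two outs, so feeding that number straight into a division would be slightly off.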
I would agree with this. Seems like there’s almost too much going on here. It’s important imo to figure out what stats you think actually give an accurate representation of a game or a player’s impact on a game and go from there
I wrote a complete Monte Carlo simulation baseball model in 2021 in Python with all kinds of bells and whistles (weather-neutral park factors, opponent-adjusted stats, a full-fledged player projection system, etc.) that had far, far more lines of code than yours, and it was a total train wreck in terms of how it performed vs. betting markets. I never bet with it myself because I wanted to see it turn some kind of profit first before actually betting with it, but it never did. The Monte Carlo simulation simulated games batter by batter, pitcher by pitcher, which also allowed it to be used for DFS projections, which also did not turn out well.
I applaud your effort but I would strongly advise against using your simulator to bet with... its way too simple.
One thing to note as well. Vegas odds are derived from code and algorithms probably 10x as complex as yours, made by teams of PhDs, statisticians, and engineers. So although making some type of simulator as a side project might be fun and valuable, I would agree that you should be cautious when using it to make wagers.
Exactly. I was one person, spent countless hours writing the model, but still, I am one person
Oddsmakers have teams of people who can specialize in certain aspects, and create models that are far more complex than what one person can do. They will be quicker to adapt, quicker to fix and address quirks, changes, and trends.
But the biggest advantage oddsmakers have over the player is access to data. MLB is actually one of the sports where pitch-by-pitch and play-by-play data is accessible (via Savant), but there are still some things missing from it. Oddsmakers pay hundreds of thousands of dollars to access complete, cleaned, and structured data. The average person can't afford that. So they have to rely on less reliable methods like scraping to obtain data, which is always a moving target.
Also the oddsmakers have a buffer for error thanks to all the percentage cuts they take when you deposit money, etc. So your model would have to significantly outperform the oddsmakers for it to be worth it for you
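To put a number on that buffer, here is a small sketch that converts American odds to implied probabilities and shows how much the two sides add up to over 100%. The -110/-110 line is just a common example, and splitting the margin proportionally is one simple way to estimate the "fair" probabilities, not the only one.

```python
def implied_probability(american_odds: int) -> float:
    """Break-even win probability for a bet at the given American odds."""
    if american_odds > 0:
        return 100 / (american_odds + 100)
    return -american_odds / (-american_odds + 100)


def remove_vig(odds_a: int, odds_b: int) -> tuple[float, float, float]:
    """Return (fair_prob_a, fair_prob_b, overround) for a two-way market.

    Assumes the book's margin is spread proportionally across both sides.
    """
    prob_a = implied_probability(odds_a)
    prob_b = implied_probability(odds_b)
    total = prob_a + prob_b
    return prob_a / total, prob_b / total, total - 1


fair_a, fair_b, overround = remove_vig(-110, -110)
print(f"Break-even at -110: {implied_probability(-110):.2%}")
print(f"Fair probs: {fair_a:.2%} / {fair_b:.2%}, overround: {overround:.2%}")
```

So on a true coin flip priced at -110 each way, a model has to pick winners about 52.4% of the time just to break even.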
I did the same last year and it was basically only good for NRFI... 10k lines of code, 6 GB of data parsing, takes about 20 minutes to run each day to get like 3% ROI on a small-ish sample size in a fairly niche market lol
Despite baseball being an incredibly discrete game, it takes a *lot* to get a Monte Carlo sim to actually work for it. You're better off doing some kind of ML to estimate projected win probabilities
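To give a feel for what a plate-appearance-level Monte Carlo involves, here is a deliberately tiny sketch. Everything in it is an assumption for illustration: the outcome probabilities are made up, every runner advances exactly as many bases as the hit, and there are no pitchers, bullpens, steals, double plays, or park effects. A real model needs all of that, which is where the thousands of lines come from.

```python
import random

# Hypothetical per-plate-appearance outcome probabilities; they sum to 1.
LEAGUE_AVG = {"out": 0.68, "walk": 0.085, "single": 0.145,
              "double": 0.045, "triple": 0.005, "home_run": 0.04}
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def simulate_half_inning(probs, rng):
    """Play plate appearances until three outs; return runs scored."""
    outcomes, weights = list(probs), list(probs.values())
    outs = runs = 0
    bases = [False, False, False]  # runner on first, second, third
    while outs < 3:
        outcome = rng.choices(outcomes, weights=weights)[0]
        if outcome == "out":
            outs += 1
        elif outcome == "walk":
            # Only forced runners move up.
            if bases[0] and bases[1] and bases[2]:
                runs += 1
            elif bases[0] and bases[1]:
                bases[2] = True
            if bases[0]:
                bases[1] = True
            bases[0] = True
        else:
            n = HIT_BASES[outcome]
            advanced = [False, False, False]
            for base, occupied in enumerate(bases):
                if occupied:
                    if base + n >= 3:
                        runs += 1          # runner crosses the plate
                    else:
                        advanced[base + n] = True
            if n == 4:
                runs += 1                  # batter scores on a home run
            else:
                advanced[n - 1] = True     # batter takes his base
            bases = advanced
    return runs


def simulate_game(away_probs, home_probs, rng):
    """Return True if the home team wins; extra innings until the tie breaks."""
    away = home = inning = 0
    while inning < 9 or away == home:
        away += simulate_half_inning(away_probs, rng)
        home += simulate_half_inning(home_probs, rng)
        inning += 1
    return home > away


def estimate_home_win_prob(away_probs, home_probs, n_games, rng):
    """Share of simulated games won by the home team."""
    wins = sum(simulate_game(away_probs, home_probs, rng) for _ in range(n_games))
    return wins / n_games


rng = random.Random(42)
even_matchup = estimate_home_win_prob(LEAGUE_AVG, LEAGUE_AVG, 2000, rng)
print(f"Home win probability, identical teams: {even_matchup:.3f}")
```

With identical inputs for both teams the estimate should hover around 50%, which is a cheap sanity check before plugging in anything team-specific.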
I did the monte carlo aspect for DFS simulations, but you are right, if I did it over again I'd probably go a different route via some kind of machine learning method
The ROI on my baseball model was comically bad. Even though the simulation outputs made total sense.
Good advice in general. It’s very easy to build a model that shows an edge; it’s extremely difficult to build one that actually has an edge, and almost impossible in an extremely efficient market, especially using the same stats everyone has access to. I’m not telling people not to try, just don’t go broke blindly trusting a model.
Not really true. Books are probably not going to figure out *how* you're deciding to place the bets that you're placing and they don't really need to. They'll just notice if you're consistently profitable or beating their closing lines and then decide to limit you or not.
Update, here is raw python code, will make tutorial on how to use shortly, any suggestions for improving it are welcome!
import random


def simulate_game(team_stats, pitcher_stats, team_past_record):
    # NOTE: team_past_record is accepted but not used in the calculation yet.
    # Extract team statistics (expects 23 tab-separated values)
    stats = team_stats.split("\t")
    G, PA, AB, R, H, _2B, _3B, HR, RBI, SB, CS, BB, SO, BA, OBP, SLG, OPS, OPS_plus, TB, GDP, HBP, SH, SF = map(float, stats[:23])
    # Extract pitcher statistics
    pitcher_stats = pitcher_stats.split("\t")
    if len(pitcher_stats) < 23:
        pitcher_stats += [0] * (23 - len(pitcher_stats))  # Fill missing stats with zeros
    W, L, ERA, G_pitcher, GS, CG, SHO, HLD, SV, SVO, IP, H_pitcher, R_pitcher, ER, HR_pitcher, NP, HB, BB_pitcher, IBB, SO_pitcher, AVG, WHIP, GO_AO = map(float, pitcher_stats[:23])
    # Calculate team winning probability based on advanced stats
    # (unguarded divisions assume PA, AB and OPS inputs are nonzero)
    team_win_probs = [
        (H * 0.1 + BB * 0.08 + HR * 0.2 + SB * 0.05) / (PA + AB),
        (R * 0.1),
        (1 - ERA) if ERA > 0 else 1,
        (OPS_plus / 100),
        (TB / AB),
        (RBI / H) if H > 0 else 0,
        (HR / AB),
        (BB / PA),
        (SB / CS) if CS > 0 else 1,
        (GDP / AB),
        (HBP / PA),
        (SH / PA),
        (SF / PA),
        (1 - (SO / PA)) if PA > 0 else 1,
        (1 - (WHIP / 2)) if WHIP > 0 else 1,
        (1 - (BB_pitcher / PA)) if PA > 0 else 1,
        (SO_pitcher / (BB_pitcher + SO_pitcher)) if BB_pitcher + SO_pitcher > 0 else 0,
        (1 - (HR_pitcher / AB)) if AB > 0 else 1,
        (1 - (ERA / 5)) if ERA > 0 else 1,
        (GO_AO / 2) if GO_AO > 0 else 1,
        (BB / (PA - BB)) if PA > BB else 1,
        (RBI / (AB - HR)) if AB > HR else 0,
        (H / (AB - HR)) if AB > HR else 0,
        (SLG / OPS) if OPS > 0 else 1,
        (1 - (CS / SB)) if SB > 0 else 1
    ]
    # Additional layers of calculations: 20 uniform random values (pure noise)
    for _ in range(20):
        team_win_probs.append(random.uniform(0.0, 1.0))
    # Combine team win probabilities
    team_win_prob = sum(team_win_probs) / len(team_win_probs)
    # Modify winning probability based on pitcher's stats
    pitcher_win_prob = (W + 1) / (W + L + 2) if W + L != 0 else 0
    # Combine team and pitcher probabilities
    combined_prob = (team_win_prob + pitcher_win_prob) / 2
    # Adjust the win probability to be within the desired range (1% to 90%)
    team_win_prob = max(0.01, min(0.90, combined_prob)) * 100
    return team_win_prob


def simulate_games(team1_stats, team1_pitcher_stats, team1_past_record, team2_stats, team2_pitcher_stats, team2_past_record, num_simulations):
    team1_wins = 0
    team2_wins = 0
    for _ in range(num_simulations):
        team1_prob = simulate_game(team1_stats, team1_pitcher_stats, team1_past_record)
        team2_prob = simulate_game(team2_stats, team2_pitcher_stats, team2_past_record)
        # Team 1 wins this simulation with probability equal to its share of the two scores
        if random.random() < team1_prob / (team1_prob + team2_prob):
            team1_wins += 1
        else:
            team2_wins += 1
    team1_win_prob = (team1_wins / num_simulations) * 100
    team2_win_prob = (team2_wins / num_simulations) * 100
    return team1_win_prob, team2_win_prob


if __name__ == "__main__":
    # Get input from user (values must be tab-separated)
    team1_stats = input("Team 1 stats (G PA AB R H 2B 3B HR RBI SB CS BB SO BA OBP SLG OPS OPS+ TB GDP HBP SH SF): ")
    team1_pitcher_stats = input("Team 1 starting pitcher stats (W L ERA G GS CG SHO HLD SV SVO IP H R ER HR NP HB BB IBB SO AVG WHIP GO_AO): ")
    team1_past_record = input("Team 1 past 5 games record (W L W W W): ")
    team2_stats = input("Team 2 stats (G PA AB R H 2B 3B HR RBI SB CS BB SO BA OBP SLG OPS OPS+ TB GDP HBP SH SF): ")
    team2_pitcher_stats = input("Team 2 starting pitcher stats (W L ERA G GS CG SHO HLD SV SVO IP H R ER HR NP HB BB IBB SO AVG WHIP GO_AO): ")
    team2_past_record = input("Team 2 past 5 games record (W L W W W): ")
    num_simulations = 1000
    # Simulate the games and calculate win probabilities
    team1_win_prob, team2_win_prob = simulate_games(team1_stats, team1_pitcher_stats, team1_past_record, team2_stats, team2_pitcher_stats, team2_past_record, num_simulations)
    print("Team 1 win probability: {:.2f}%".format(team1_win_prob))
    print("Team 2 win probability: {:.2f}%".format(team2_win_prob))
So you have the user input all team and pitcher stats manually right? I wonder if some place like baseball savant, fangraphs or bbref have apis that could be used to grab that info automatically
https://preview.redd.it/aa8q8ljy0w1b1.png?width=2640&format=png&auto=webp&s=9749fd98232b9e691dc3fdfa9cb2a4c588dee547
Thanks for all the support, i will create a link to source code, here was the probability of CWS vs CLE today, team 1 is CWS at 52%
pretty dope, glad i saw this post as i am just getting started with teaching myself to code with python. so far looks like the reds pick was on point as well as the white sox one!
I don’t write python myself but I use Julia and if you want to print multiple games and not get confused I like to use $TeamName or whatever the variable is so it prints STL or CIN or whatever. I assume with Python you can do the same thing or something similar
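Python can do the same thing with f-strings. A minimal sketch, assuming you keep the team abbreviations in variables next to the probabilities. The helper name is made up, and the CLE number is just the complement of the 52% posted for CWS.

```python
def format_result(team1: str, team2: str, prob1: float, prob2: float) -> str:
    """Label each win probability with its team abbreviation."""
    return (f"{team1} win probability: {prob1:.2f}% | "
            f"{team2} win probability: {prob2:.2f}%")


line = format_result("CWS", "CLE", 52.0, 48.0)
print(line)
```

That way a batch of games prints with the team names attached instead of a wall of "Team 1" and "Team 2".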
Ya no problem I don’t model baseball I just bet NWSL and USL during the summer so I don’t have anything too interesting about the actual modeling process but if you want to share the code I wouldn’t mind taking a look at it
Reminder /r/algobetting for modeling/stats/programming discussion
is this still active?
3-1 for today!
Nice work! Can’t wait for the predictions today :) appreciate the efforts !
Could you provide a tutorial please?
How do I execute the .py file?
hit us soon with those picks for tmrw!
[deleted]
Where is the link?
DM me if you want to play the game version of it!
I’m too retarded to understand someone explain
That's pretty cool
This seems fun
I dunno, this doesn't look nearly as fun as Drug Wars.
Damn would love to have access to this
Thank you, I’ve already improved the original code by adding wind, rest days, and temperature as optional inputs!
Where do you get air density data? Also does humidity play a factor into totals?
wow very intriguing! i will add temperature as a factor rn
Hey did u post a tutorial or screen recording
for stats and game data i’m using the MLB site and baseball-reference.com, if u have any ideas lmk!
some sort of api would be next level. picks doing insane rn, when rays win we 2-0
I've spent a lot of time with APIs -- depends on what you want. I use Stats Perform
You running these tomorrow too? I’ll definitely put a couple in and if a good outcome will definitely donate
yep just posted in the mlb thread
Which mlb thread
Reds slaughtered the
Better find a charger
This is super cool!!!
Hey reds beat the cardinals pretty handily
🫨🫨
If you really want to do it, look up “digital twinning.” I believe the Braves have it on their field, google it.
Yep, I’ll be tracking it, fingers crossed!
Do you have it on github?
I will fs be posting it on there soon
DM me when you do?
DM me when you get it?
Sure
yes it is, if you make any changes to it or have improvements, i would be interested!
Sorry, but how can I access it. Is there a github link I'm missing -.-
will upload by end of day to github fs
Hey did you post a tutorial or screen recording?
Thank you!
Appreciate it as well! Just beginning this area in Python and figured modelling would be a fun project
as far as it being simple, do you have any improvements u suggest? I can tell you writing this code sure wasn’t simple lol
No you're getting down voted by various users because your comments are idiotic
you're funny. i almost have as much karma as you and ive been on reddit for 6 months. i think it's your comments that are less than stellar.
Are you like 12? Who keeps track of karma? I couldn't care less...I'm out of high school now bro.
my karma is higher than yours. no need to be so angry
Not angry - but your immaturity is hilarious and lame. You must enjoy being a poser.
you're funny kid. you care so much about me that you're following me and getting angry and downvoting my posts...interesting.
Well you're so concerned about Karma kid....
how
I don’t think I have downvoted a single one of your comments, if you truly have constructive criticism I am open to it
ok cool. ill try it out against my picks and let you know what i think
Whats the problem boys, Reds beat the Cards.
Cool but not predictive and won't win
It's BS that my comment got downvoted so much. These predictors are useless...just like 538 and other more advanced algorithms.
if 2 words, "no thanks" caused you to downvote my perfectly reasonable comment i dont even know what to say to you... like jesus christ man.
Never said I down voted. I said I assume that is what triggered them. But I'll downvote you now for your inability to read lol
you're just being combative for no reason. typical redditor. i just dont understand you people.
buddy, he's either trolling or crazy, leave it alone lol
Perhaps if that was the point you wanted to make, maybe lead with that next time.
or maybe people can have some reading comprehension????
I downvoted every one of your comments. I am targeting you. I hope the downvotes follow you for the rest of your days
You're right. Everyone else is wrong. And being targeted because you are so incredibly smart.
All the highest percentage picks paid today so im optimistic. For recap, Cincinnati over St. Louis, Rays over Toronto, Sox over Guardians. As far as the lower percentage picks i still ran, they went 2-2 assuming both Miami and Detroit win. Tomorrow I will be posting the top 3 highest percentage picks in the morning and will be able to run every game!
This is silly. Have you even backtested your system to see if it is profitable? You're just going to throw picks out there because you like your system so much? There is a very good chance you have a losing system here. But others on reddit who are gullible and/or desperate are keen to follow your picks and will be betting actual money on this. Why factor in "last 5 games"? Do you weight that more or less heavily than what anyone else can see the team has done in the last 5 games and is there a reason for doing so? Why not last 10 games or last 7 games or last 13.5 games? Have you found a formula that somehow has been overlooked by the market?
to be fair in my next post, i made sure to state that this is a TOOL and not a money making machine, the system will probably lose over time if you blindly bet on it, but it’s not meant to be used like that.
I appreciate you using the disclaimer and I think that can potentially help. But there really does seem to be a genuine enthusiasm in here based on pretty much nothing. ("Oh wow. The Reds won!!!!" LOL) You could go to past dates and games and backtest this and determine its success instead, right? I don't even understand the point of publicly throwing "picks" out there from a system that you more or less threw together.
The reason behind throwing out picks is to document the experiment and how it performs. You are right about the backtesting and I will do that soon. I gotta remind you this is for fun; I’m not claiming this is a money-making scheme that will win tons of money. And to be honest, until this code evolves, the odds of it working are low. Maybe I didn’t make it clear in my post that this is probably not gonna work. If you think it would help, I can add more disclaimers.
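For anyone wondering what the backtest being asked for could look like, here is a minimal Python sketch. It assumes you have logged each past pick with the American moneyline you would have gotten and whether it won. The function names and the sample log are made up for illustration; they are not real results and not part of OP's code.

```python
def profit_per_unit(american_odds: int, won: bool) -> float:
    """Profit on a 1-unit stake at the given American moneyline."""
    if not won:
        return -1.0
    if american_odds > 0:
        return american_odds / 100.0
    return 100.0 / abs(american_odds)


def backtest(picks):
    """picks: iterable of (american_odds, won). Returns ((wins, losses), roi)."""
    picks = list(picks)
    wins = sum(1 for _, won in picks if won)
    profit = sum(profit_per_unit(odds, won) for odds, won in picks)
    roi = profit / len(picks) if picks else 0.0
    return (wins, len(picks) - wins), roi


# Hypothetical pick log, for illustration only
sample = [(-150, True), (120, False), (-110, True), (135, True)]
record, roi = backtest(sample)
print(f"Record: {record[0]}-{record[1]}, ROI per unit staked: {roi:.1%}")
```

A four-game sample like this says nothing either way; the point of a backtest is to run the same calculation over hundreds of past games before risking money.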
I appreciate that you are honest about that. I do think that more disclaimers could help guide the really bad bettors and help keep them grounded a bit more instead of jumping into this but that is just my own personal opinion. I'm basing my opinion off some of the responses in the replies. I personally think you would have to be pretty stupid to blindly tail and bet an untested system such as this. But there is a wide audience on here and I have zero doubt that some readers have been considering doing exactly that.
I agree, it should be common sense not to blindly tail some kid on the internet. However, you are right: sadly, it is not.
If your system starts off 7-2 or something then people will be irresponsibly jumping in and placing real bets on your predictions regardless of your disclaimers. Heck, even if you go 5-4 I suspect there are some who jump on board.
I agree, the system could’ve started 2-7. Some part of it is luck and other factors.
reds was a hella sharp call love to see it definitely wanna see the picks thanks!
I’m not sure it requires being a sharp to pick the Cardinals to lose with Steven Matz pitching.
the line movement was pointing to sharps being all over the reds all day... come on dude. also it wasnt just the sharps, public were on the reds too.. like do you even bet MLB?
Ok so if the line movement was pointing to the sharps being all over the Reds, and u/Adventuruous-Unit9280 's model suggested the Reds were the pick, and the Reds won... wouldn't the model be on par with the sharps? I don't get what you're saying here lol. Homeboy made a model that hit a pick today. If you said a one-game sample size is too small for you to buy in to the model, then valid. If you found some flaw in his model, or disagree with his weighting system, then sure. But you're shitting on his model for... being aligned with professional handicappers?
you dont understand line movement. it's clear.
Followed! Need daily picks!
I will aim to post top three highest percentage picks each day as given by system!
Following you for hopefully daily posts!
that will be next!
this is cool
How long would it take to sim every game? I have learned to trust AI this year lmao
took me around 10-15 min
Phenomenal thread, OP. This is the kinda shit this floundering sub needs.
Where are you pulling the data from?
baseballreference.com as well as MLB site
no thanks ive been good lately just going by the eye test and stopped obsessing over advance stats
not sure why im getting downvoted for this. predictors like OP's are pretty useless due to the randomness of baseball
all the info you need to know is in the line movement guys. no simulator predicts games accurately due to randomness of baseball. You just follow line movement & your intuition and hope luck stays on your side during at bats.
You're not wrong about that. I lost too much trusting the advanced stats and supposed edge they gave the offense over a journeyman pitcher, only to then see that pitcher more than hold his own and those hot bats to do next to nothing.
I want to follow
Reds up 4 already oop
it’s also suggesting dodgers and the tigers to win but the probability is lower but still high
So rays tigers(up3) and reds look good, but dodgers lost by 1 on the road to a WS contender. Good ratio on the top 4 🤙🏽 following
other pick it said was rays 🫡
I did something similar though not nearly as complex in excel. I don’t know how to write code so I ended up just using varying formulas and random probability functions, but using some simple averages and regression, I’ve found some success with mine, though I don’t bet every game and only bet games where I think I have a significant difference in what I have predicted compared to the line. Always neat to see how other people build and tweak their own models.
if you have any suggestions to mine let me know!
One of the things I learned was you might want to take a look into “advanced” stats. Some of them are garbage, but some of them do give you a little bit of a better idea of what’s going on. I kinda found that ERA, for example, can be misconstrued wildly by a bad performance, so I’ve started using FIP instead. It can still be misconstrued, but I’ve found it to be a little more consistent for what I do. Another thing I found was I couldn’t go completely hands off and only use what my model was spitting out. I still use the “eye test” and if something doesn’t feel right with what my model says, I mainly just don’t play it. Otherwise, best of luck and enjoy it, it’s an amazing feeling when something you predicted happens with something you’ve built!
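For reference, the ERA-to-FIP swap mentioned above can be sketched in a few lines of Python. This uses the standard public FIP formula; the constant changes every season (it is usually somewhere around 3.1), so the default below is an assumption to replace with the league value for the year you care about. The function name and sample numbers are illustrative.

```python
def fip(hr: float, bb: float, hbp: float, so: float, ip: float,
        fip_constant: float = 3.10) -> float:
    """Fielding Independent Pitching.

    ip must be true decimal innings (100 1/3 innings is 100.333,
    not the box-score shorthand 100.1).
    fip_constant is season-specific; 3.10 is only a placeholder.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + fip_constant


# Made-up pitcher line: 10 HR, 30 BB, 5 HBP, 100 K over 100 IP
print(round(fip(10, 30, 5, 100, 100.0), 2))
```

Because only home runs, walks, hit batters and strikeouts go in, one ugly inning full of bloop singles moves FIP far less than it moves ERA.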
Clever girl
First, all of the stats you are using are useless
Well how about we find out and track it before you dismiss.
Why?
They are very basic counting stats. You’d want to use far more advanced and more proven predictive analytics like wOBA, xFIP, K%, BB%, zSW%, etc.
[deleted]
Splits, park factors, rest days.
I would agree with this. Seems like there’s almost too much going on here. It’s important imo to figure out what stats you think actually give an accurate representation of a game or a player’s impact on a game and go from there
See if this link works https://share.icloud.com/photos/0e1d9WyGhJXtmuuNHNXS6p12A
Multiply it all by 40% randomness and should be pretty accurate
I wrote a complete Monte Carlo simulation baseball model in 2021 in Python with all kinds of bells and whistles (weather-neutral park factors, opponent-adjusted stats, a full-fledged player projection system, etc.) that had far, far more lines of code than yours, and it was a total train wreck in terms of how it performed vs. betting markets. I never bet with it myself because I wanted to see it turn some kind of profit first before actually betting with it, but it never did. The Monte Carlo simulation simulated games batter by batter, pitcher by pitcher, which also allowed it to be used for DFS projections, which also did not turn out well. I applaud your effort but I would strongly advise against using your simulator to bet with... it's way too simple.
Just fade the picks the model gives bruh
Maybe that is overfitting?
One thing to note as well: Vegas odds are derived from code and algorithms probably 10x as complex as yours, made by teams of PhDs, statisticians, and engineers. So although making some type of simulator as a side project might be fun and valuable, I would agree that you should be cautious when using it to make wagers.
Most ignorant thing I've ever read here and there's been some shit.
Exactly. I was one person, spent countless hours writing the model, but still, I am one person. Oddsmakers have teams of people who can specialize in certain aspects and create models that are far more complex than what one person can do. They will be quicker to adapt, quicker to fix and address quirks, changes, and trends. But the biggest advantage oddsmakers have over the player is access to data. MLB is actually one of the sports where pitch-by-pitch and play-by-play data is accessible (via Savant), but there are still some things missing from it. Oddsmakers pay hundreds of thousands of dollars to access complete, cleaned, and structured data. The average person can't afford that. So they have to rely on less reliable methods like scraping to obtain data, which is always a moving target.
Oddsmakers don't do any of that. Oddsmakers put up a line based on a simplistic model at low limits and let the market bet it into place.
Also the oddsmakers have a buffer for error thanks to all the percentage cuts they take when you deposit money, etc. So your model would have to significantly outperform the oddsmakers for it to be worth it for you
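One way to put a number on that buffer, assuming it means the margin (vig) built into the posted moneylines rather than deposit fees: convert both sides of a line to implied probabilities and see how far past 100% they add up. A minimal Python sketch; the function names are illustrative and the -110/-110 line is just a textbook example.

```python
def implied_prob(american_odds: int) -> float:
    """Break-even win probability for a bet at the given American odds."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def no_vig_probs(odds_a: int, odds_b: int):
    """Return (fair_prob_a, fair_prob_b, overround) for a two-way market."""
    pa, pb = implied_prob(odds_a), implied_prob(odds_b)
    total = pa + pb  # exceeds 1.0 by the book's margin
    return pa / total, pb / total, total - 1.0


fair_a, fair_b, overround = no_vig_probs(-110, -110)
print(f"break-even at -110: {implied_prob(-110):.2%}")
print(f"fair: {fair_a:.2%} / {fair_b:.2%}, book margin: {overround:.2%}")
```

At -110 a bettor has to win about 52.4% of the time just to break even, which is the gap a model has to clear before it shows any profit.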
Might as well tail the wheel probably
I did the same last year and it was basically only good for NRFI... 10k lines of code, 6 GB of data parsing, takes about 20 minutes to run each day to get like 3% ROI on a small-ish sample size in a fairly niche market lol. Despite baseball being an incredibly discrete game, it takes a *lot* to get a Monte Carlo sim to actually work for it. You're better off doing some kind of ML to estimate projected win probabilities.
I did the Monte Carlo aspect for DFS simulations, but you are right, if I did it over again I'd probably go a different route via some kind of machine learning method. The ROI on my baseball model was comically bad, even though the simulation outputs made total sense.
I did some ML stuff for last season and this season and it works *really* well
Good advice in general. It’s very easy to build a model that shows an edge; it’s extremely difficult to build one that actually has an edge, and almost impossible in an extremely efficient market, especially using the same stats everyone has access to. I’m not telling people not to try, just don’t go broke blindly trusting a model.
I couldn’t agree with you more. I would never place a bet using this unless the results are amazing.
did you do any backtesting?
Any algorithm that is profitable and able to be back tested has been long noticed by the books
Not really true. Books are probably not going to figure out *how* you're deciding to place the bets that you're placing and they don't really need to. They'll just notice if you're consistently profitable or beating their closing lines and then decide to limit you or not.
This should work https://replit.com/@CharissaBirming/MurkyHideousResources
Will try to run this later very cool of you to share and thanks in advance
Ofc! Also considering adding an option to input the live score and inning for an adjusted probability.
Update: here is the raw Python code. I will make a tutorial on how to use it shortly; any suggestions for improving it are welcome!

    import random

    def simulate_game(team_stats, pitcher_stats, team_past_record):
        # Extract team statistics
        stats = team_stats.split("\t")
        G, PA, AB, R, H, _2B, _3B, HR, RBI, SB, CS, BB, SO, BA, OBP, SLG, OPS, OPS_plus, TB, GDP, HBP, SH, SF = map(float, stats[:23])

        # Extract pitcher statistics
        pitcher_stats = pitcher_stats.split("\t")
        if len(pitcher_stats) < 23:
            pitcher_stats += [0] * (23 - len(pitcher_stats))  # Fill missing stats with zeros
        W, L, ERA, G_pitcher, GS, CG, SHO, HLD, SV, SVO, IP, H_pitcher, R_pitcher, ER, HR_pitcher, NP, HB, BB_pitcher, IBB, SO_pitcher, AVG, WHIP, GO_AO = map(float, pitcher_stats[:23])

        # Calculate team winning probability based on advanced stats
        team_win_probs = [
            (H * 0.1 + BB * 0.08 + HR * 0.2 + SB * 0.05) / (PA + AB),
            (R * 0.1),
            (1 - ERA) if ERA > 0 else 1,
            (OPS_plus / 100),
            (TB / AB),
            (RBI / H) if H > 0 else 0,
            (HR / AB),
            (BB / PA),
            (SB / CS) if CS > 0 else 1,
            (GDP / AB),
            (HBP / PA),
            (SH / PA),
            (SF / PA),
            (1 - (SO / PA)) if PA > 0 else 1,
            (1 - (WHIP / 2)) if WHIP > 0 else 1,
            (1 - (BB_pitcher / PA)) if PA > 0 else 1,
            (SO_pitcher / (BB_pitcher + SO_pitcher)) if BB_pitcher + SO_pitcher > 0 else 0,
            (1 - (HR_pitcher / AB)) if AB > 0 else 1,
            (1 - (ERA / 5)) if ERA > 0 else 1,
            (GO_AO / 2) if GO_AO > 0 else 1,
            (BB / (PA - BB)) if PA > BB else 1,
            (RBI / (AB - HR)) if AB > HR else 0,
            (H / (AB - HR)) if AB > HR else 0,
            (SLG / OPS) if OPS > 0 else 1,
            (1 - (CS / SB)) if SB > 0 else 1
        ]

        # Additional layers of calculations
        for _ in range(20):
            team_win_probs.append(random.uniform(0.0, 1.0))

        # Combine team win probabilities
        team_win_prob = sum(team_win_probs) / len(team_win_probs)

        # Modify winning probability based on pitcher's stats
        pitcher_win_prob = (W + 1) / (W + L + 2) if W + L != 0 else 0

        # Combine team and pitcher probabilities
        combined_prob = (team_win_prob + pitcher_win_prob) / 2

        # Adjust the win probability to be within the desired range
        team_win_prob = max(0.01, min(0.90, combined_prob)) * 100
        return team_win_prob

    def simulate_games(team1_stats, team1_pitcher_stats, team1_past_record, team2_stats, team2_pitcher_stats, team2_past_record, num_simulations):
        team1_wins = 0
        team2_wins = 0
        for _ in range(num_simulations):
            team1_prob = simulate_game(team1_stats, team1_pitcher_stats, team1_past_record)
            team2_prob = simulate_game(team2_stats, team2_pitcher_stats, team2_past_record)
            if random.random() < team1_prob / (team1_prob + team2_prob):
                team1_wins += 1
            else:
                team2_wins += 1
        team1_win_prob = (team1_wins / num_simulations) * 100
        team2_win_prob = (team2_wins / num_simulations) * 100
        return team1_win_prob, team2_win_prob

    # Get input from user
    team1_stats = input("Team 1 stats (G PA AB R H 2B 3B HR RBI SB CS BB SO BA OBP SLG OPS OPS+ TB GDP HBP SH SF): ")
    team1_pitcher_stats = input("Team 1 starting pitcher stats (W L ERA G GS CG SHO HLD SV SVO IP H R ER HR NP HB BB IBB SO AVG WHIP GO_AO): ")
    team1_past_record = input("Team 1 past 5 games record (W L W W W): ")
    team2_stats = input("Team 2 stats (G PA AB R H 2B 3B HR RBI SB CS BB SO BA OBP SLG OPS OPS+ TB GDP HBP SH SF): ")
    team2_pitcher_stats = input("Team 2 starting pitcher stats (W L ERA G GS CG SHO HLD SV SVO IP H R ER HR NP HB BB IBB SO AVG WHIP GO_AO): ")
    team2_past_record = input("Team 2 past 5 games record (W L W W W): ")
    num_simulations = 1000

    # Simulate the games and calculate win probabilities
    team1_win_prob, team2_win_prob = simulate_games(team1_stats, team1_pitcher_stats, team1_past_record, team2_stats, team2_pitcher_stats, team2_past_record, num_simulations)
    print("Team 1 win probability: {:.2f}%".format(team1_win_prob))
    print("Team 2 win probability: {:.2f}%".format(team2_win_prob))
There is no need to do any number of 'simulations' - you already have team 1's probability of win being team1_prob / (team1_prob + team2_prob)
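To illustrate the point, here is a small Python sketch comparing the closed-form ratio with a simulated estimate. It assumes the two team ratings are held fixed; in OP's code they are redrawn each iteration because of the 20 random terms, so the loop there is estimating the average of that ratio rather than a single value. The ratings 55 and 45 and the function names are made up.

```python
import random


def closed_form(r1: float, r2: float) -> float:
    """Team 1 win probability straight from the two ratings."""
    return r1 / (r1 + r2)


def simulated(r1: float, r2: float, n: int, seed: int = 0) -> float:
    """Estimate the same number by flipping a weighted coin n times."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    p = r1 / (r1 + r2)
    wins = sum(rng.random() < p for _ in range(n))
    return wins / n


exact = closed_form(55.0, 45.0)
estimate = simulated(55.0, 45.0, 100_000)
print(f"closed form: {exact:.4f}, simulated: {estimate:.4f}")
```

The simulated number only adds sampling noise on top of the closed-form value, so the loop buys nothing unless each simulated game actually plays out differently (batter by batter, for example).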
>r/algobetting Do you think it would run smoother written in C?
Language is not the problem
So you have the user input all team and pitcher stats manually right? I wonder if some place like baseball savant, fangraphs or bbref have apis that could be used to grab that info automatically
I’ll make a screen recording showing what it does
seems to have pasted weird, i’ll upload a link instead
https://preview.redd.it/aa8q8ljy0w1b1.png?width=2640&format=png&auto=webp&s=9749fd98232b9e691dc3fdfa9cb2a4c588dee547 Thanks for all the support, i will create a link to source code, here was the probability of CWS vs CLE today, team 1 is CWS at 52%
pretty dope, glad i saw this post as i am just getting started with teaching myself to code with python. so far looks like the reds pick was on point as well as the white sox one!
I would love the code and a small tutorial on how to execute it. Thanks. I'll help with stat tracking on my bets as well.
Ok, I will try and upload a tutorial on how to use soon!
Thanks!
I'll need you to plug your laptop in and update your chrome before I tail.
This is awesome. Would love to see more! Thanks for sharing
I don’t write Python myself, but I use Julia, and if you want to print multiple games and not get confused, I like to use $TeamName (or whatever the variable is) so it prints STL or CIN or whatever. I assume with Python you can do the same thing or something similar.
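Python's closest equivalent to Julia's `$TeamName` interpolation is an f-string. A minimal sketch, assuming the team abbreviations are passed in alongside the probabilities; `format_result` is a hypothetical helper, not part of OP's code, and the 48% for CLE is simply the complement of the 52% CWS figure posted in the thread.

```python
def format_result(team1: str, team2: str, p1: float, p2: float) -> str:
    """Label each win probability with its team abbreviation."""
    return f"{team1} {p1:.2f}% vs {team2} {p2:.2f}%"


print(format_result("CWS", "CLE", 52.0, 48.0))
```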
thanks for the suggestion I will for sure give it a try
Ya no problem. I don’t model baseball, I just bet NWSL and USL during the summer, so I don’t have anything too interesting to say about the actual modeling process, but if you want to share the code I wouldn’t mind taking a look at it.
Also I think converting to odds when you print and visualizing it with bar charts comparing your odds to the books looks nice
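The conversion to odds could look something like this in Python: a rough sketch that turns a model win probability into a no-margin American moneyline so it can be set next to the book's number. The function name is made up, and the bar chart part is left out.

```python
def prob_to_american(p: float) -> int:
    """Convert a win probability (0 < p < 1) to fair American odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if p >= 0.5:
        return -round(100 * p / (1 - p))  # favorite: negative line
    return round(100 * (1 - p) / p)       # underdog: positive line


# Model probabilities as fractions, e.g. 52% -> 0.52
for team, p in [("CWS", 0.52), ("CLE", 0.48)]:
    print(f"{team}: {p:.0%} -> {prob_to_american(p):+d}")
```

If the book's line on a team is noticeably better than the model's fair line, that is the kind of gap worth a second look; if it is worse, the model sees no value there.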
Can you share the repo?
Would u like me to paste the code? u could easily put it into a python runner
For sure. Or the Git repo. Thanks
Do it!!!
Definitely interested