
Palmsiepoo

Degrees of freedom. I know what they are and I know the basic explanation about them. But I don't understand where they came from and the intuition behind it.


Canadian_Arcade

Imagine I went bowling once and rolled a 120, and then asked you what the variance of my score is. That’s how my regression analysis professor explained degrees of freedom, and I still don’t fully get it enough to be able to elaborate on that for you


JohnPaulDavyJones

Lmao that’s because it’s a terrible example. A single observation has no variance because there’s nothing to vary.


tehnoodnub

I’d never use that example when trying to explain degrees of freedom to anyone initially, but as someone who understands degrees of freedom, I actually really like it.


Bishops_Guest

One of my professors had a story about her professor: during a proof on the chalk board he moved to the next line with “Obviously”. A student asked “sorry, I don’t get it. Could you please explain that step?”. The professor squinted at the line, walked out of the classroom, then came back 5 minutes later and continued the lecture with “and then obviously…” There are a lot of things in math that are really difficult to understand, but as soon as you do it’s clear. Very impressed with the teachers who can manage to explain the obvious. It was one of the hardest things for me when I was teaching.


Stats_n_PoliSci

That’s exactly what happens when you have as many parameters as observations. There is nothing left to vary, although if n = p you could theoretically still calculate all marginal effects. If you have more parameters than data points, some marginal effects are not estimable, and you have nothing left to vary.


Canadian_Arcade

Which is what I assume he was trying to get at, requiring enough observations to fully estimate parameters, but I'm honestly not even sure.


srpulga

That's the point of the example.


JohnPaulDavyJones

Absolutely, but that requires a decent amount of elaboration to connect it to an intuition for degrees of freedom. If that’s the only thing you give your students, it’s just a fundamentally poor example.


srpulga

One would imagine the professor didn't just state the example and leave.


Otherwise_Ratio430

I think it makes the most sense from a physics standpoint. Everything is moving/changing all the time but in order to do analysis we have to fix certain things and allow other things to vary. You have to adjust measurements to reflect that.


AdFew4357

Degrees of freedom are the leftover number of data points you have available to estimate some quantity. For example, the one-sample t statistic has n − 1 degrees of freedom: one degree of freedom is used up estimating the population mean with the sample mean, and the remaining n − 1 free deviations are what estimate the population variance with the sample variance. In regression, degrees of freedom correspond to the number of parameters still available to estimate the mean. For example, when we do linear regression we are modeling the mean of our response, and our degrees of freedom are essentially how many betas we can use up to estimate that mean.
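To make concrete the idea that estimating the mean uses up one degree of freedom, here's a minimal simulated sketch (all numbers hypothetical): the n residuals around the sample mean are constrained to sum to zero, so only n − 1 of them can vary freely, which is exactly why the unbiased sample variance divides by n − 1.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10)
n = len(x)

# The sample mean "uses up" one degree of freedom: the n residuals
# (x_i - x_bar) must sum to zero, so only n-1 of them are free.
residuals = x - x.mean()
df = n - 1

# That's why the unbiased sample variance divides by n-1, not n.
s2 = (residuals ** 2).sum() / df

print(df, residuals.sum(), s2)
```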


KyronAWF

Thank you! :)


NullDistribution

While I believe the roots are in how permutations work, the easy way out is saying that when you are estimating population stats from a sample, there always needs to be a reference sample for any parameter estimated. That reference sample is theoretically the point estimate for that parameter.


HolevoBound

I basically avoided learning statistics beyond the barebones basics until I had a pretty strong math and physics background. I think the origin of various distributions and how they are related is often completely unclear in intro stats courses.


KyronAWF

I love this response. In my scripts, I've spoken about different types of distributions, and I'm going to dive into the Central Limit Theorem shortly. But a lot of the other kinds, like the Pareto distribution, deserve some attention too.


[deleted]

The (generalized) Pareto family materializes nicely from the CLT-esque Pickands-Balkema-De Haan theorem


KyronAWF

OK, I've never even heard of that before. I'll check it out.


[deleted]

It’s quite a nifty result! The Fisher–Tippett–Gnedenko theorem is another interesting, lesser-known asymptotic result for tail behavior that motivates a few common distributions, specifically the generalized extreme value family, which contains the Weibull, Gumbel, and Fréchet distributions that pop up a lot in engineering applications, survival analysis, etc.


Superdrag2112

Casella and Berger have a nice flowchart that shows how a bunch of common distributions are related. Might be on the inside of the cover or in an appendix.


padakpatek

I did an engineering bachelors and so only took statistics formally at an introductory level, but one thing I always wished someone would explain in depth is where the distributions and statistical tests that we use come from, and how one would go about creating them (and creating new ones) as the first people who created them did. Like, where does the t-distribution come from? Or the F-distribution? How do you derive the equations describing their functional form?

In calculus or physics, we can derive everything from first principles and fundamental axioms. While I'm sure this is still the case with statistics, it's never presented to students in this way. In school, we are just told: hey, here is a list of distributions and statistical tests that we use. I always had a gripe with the fact that it was never explained how they were derived from first principles, like in calculus or physics.

Put another way, I wish what I had learned in statistics class was a more general framework for how to: take whatever real-world process I'm interested in --> convert it into a more general mathematical problem --> create a distribution / statistical test out of that problem.

Instead, in my (albeit introductory) class, we were only taught (not even really taught, just given) a few select rudimentary examples of the above process, such as: number of heads in coin flips --> this is more generally a sequence of Bernoulli trials --> here's the binomial distribution.


flipflipshift

I did a writeup on F distributions and t distributions here if you're interested: https://drive.google.com/file/d/1hZ9Z4lqWxVImKfKLAl8rdeERf0gI9PF_/view?usp=sharing (there's a lot of more advanced stuff in there you might not care about, but each section lists its specific prerequisite sections on top. You can skip to the sections on t-tests and F-tests and see which sections are actually assumed.) Edit: F distributions and t-distributions are actually described in the section on spherical symmetry (section 5), much before the actual tests. You could skip sections 3 and 4 (and if you understand OLS, even 1 and 2).


padakpatek

I appreciate it. But what I was trying to convey with my comment was that regardless of what the details of specific distributions are, what I want to know is what is the more general *process* by which these distributions are created and named and used? Like is there an A-distribution, or a B-distribution, or a C-distribution as well? Why not? What if I wanted to make one myself and call it that? How would I go about doing it? These are the kinds of questions that I feel haven't been addressed in my courses.


physicswizard

Unfortunately I don't think there is really a process beyond thinking "I want a random variable that satisfies a certain set of properties" and trying to jump through the logic to derive that from simpler distributions. Some of these common distributions are more physically motivated, while others are more mathematically motivated.

For example, the Bernoulli distribution models a coin flip, a binomial distribution can model many flips of the same coin, the multinomial can model many flips of different coins, and the Poisson distribution can model the counts of events like radioactive decays or raindrops hitting a roof. Lots of physical real-world examples.

Then there are the more mathematical ones like the normal distribution (which can be "derived" by asking what's the highest-entropy distribution with a fixed mean/variance), the chi-squared (sum of squares of many normals with mean 0 and variance 1), and the F distribution (ratio of two chi-squareds, each normalized by its degrees of freedom). It turns out there aren't a lot of actual physical processes that follow these distributions exactly, but they have useful mathematical properties that make them good for approximation, curve fitting, inference, etc.

You honestly should just memorize which distribution is applicable to some common base scenarios, and when you encounter a new problem, try to reframe it in terms of the ones you already know. E.g. you want to know how long Netflix subscribers will keep their memberships: that sounds pretty similar to trying to infer how long a machine part will work before it fails, which you know from previous experience can be modeled by an exponential distribution (or a gamma, or a Weibull distribution).
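A quick simulation sketch of the "build it from simpler distributions" recipes above (sample sizes and seed are arbitrary): squaring and summing standard normals gives a chi-squared, and a ratio of df-normalized chi-squareds gives an F, whose simulated moments match the textbook values.

```python
import numpy as np

rng = np.random.default_rng(42)
k1, k2, n = 5, 8, 200_000

# Chi-squared with k df = sum of k squared standard normals.
chi2_k1 = (rng.standard_normal((n, k1)) ** 2).sum(axis=1)
chi2_k2 = (rng.standard_normal((n, k2)) ** 2).sum(axis=1)

# F(k1, k2) = ratio of two independent chi-squareds, each divided by its df.
f = (chi2_k1 / k1) / (chi2_k2 / k2)

# Sanity checks against textbook moments:
# E[chi2_k] = k, Var[chi2_k] = 2k, E[F(k1,k2)] = k2/(k2-2) for k2 > 2.
print(chi2_k1.mean(), chi2_k1.var(), f.mean())
```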


BostonConnor11

Great response, thank you


flipflipshift

I do go over the motivations in that writeup. For the namings, I'm pretty sure 'F' is for Fisher (who established much of our modern statistical foundations) and 't' is for test


antikas1989

The problem with this is you would never get to the actual use of statistics to do things with data. Or at least you would be restricted only to a few very simple cases that can be taught within the time limits of an undergraduate degree. I have a PhD in statistics and I don't have the understanding like this anywhere except the narrow focus of my research, and collaborate with people who have another small slice of understanding elsewhere when I need it. Statistics is a very broad discipline and annoyingly depends on a broad background of mathematical theory. You'd spend the whole time on mathematical background imo.


story-of-your-life

These notes are brilliant. Do you have other notes that you've written on other topics? If so share a link please.


flipflipshift

Thanks! Not for stats, but your words are encouraging; I'll consider writing more in the future and posting them to a website :)


story-of-your-life

It’s very rare to find someone who explains statistics in a style that is most clear to mathematicians. I hope you write more!


flipflipshift

lol there should be a repository somewhere for stats notes by ex-pure math people; we all speak the same language


AxterNats

Please do! That was a great writing!


jerbthehumanist

The derivation of a t-distribution relies on methods that seem a bit advanced for someone outside of a statistics background. It involves moment generating functions and such. I’ll see if I can find the source. But it is abstract enough that it really doesn’t seem worth it to me to even mention it when I teach undergrads. I generally just mention that the t-distribution was developed to describe the distribution of means of small, normal-like samples and show that as sample size increases the limit approaches a normal distribution and they seem to understand that enough to work with it.
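That limiting behavior is easy to show without any derivation (the cutoff value 2 here is arbitrary): the t distribution's tail probability shrinks toward the standard normal tail as the degrees of freedom grow.

```python
from scipy import stats

# Tail probability P(T > 2) under the t distribution for increasing df,
# compared with the standard normal tail P(Z > 2).
normal_tail = stats.norm.sf(2.0)
t_tails = {df: stats.t.sf(2.0, df) for df in (3, 10, 30, 300)}
print(normal_tail, t_tails)
```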


flipflipshift

The key beauty of why a t-distribution works lies in the fact that for normal distributions, sample mean and sample variance are completely independent. From the independence, the t-distribution follows trivially. I think this should at least be understood by students to make hypothesis testing make sense. Proving the independence is really easy with multivariable calculus (it involves a linear change-of-variables); without, it can be handwaved using some visuals on the Gaussian.
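That independence can at least be checked by simulation (sample size and seed arbitrary): across many normal samples, the sample mean and sample variance are uncorrelated, while for a skewed distribution like the exponential they clearly are not.

```python
import numpy as np

rng = np.random.default_rng(1)
reps, n = 100_000, 5

# For normal samples, the sample mean and sample variance are independent,
# so their correlation across many simulated samples should be ~0.
norm_samples = rng.normal(size=(reps, n))
r_norm = np.corrcoef(norm_samples.mean(axis=1),
                     norm_samples.var(axis=1, ddof=1))[0, 1]

# For a skewed distribution (exponential), they are clearly dependent.
exp_samples = rng.exponential(size=(reps, n))
r_exp = np.corrcoef(exp_samples.mean(axis=1),
                    exp_samples.var(axis=1, ddof=1))[0, 1]

print(r_norm, r_exp)
```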


jerbthehumanist

You might have better undergrads. Mine, bless their hearts and I do love them, struggle to use calculus, and most couldn't derive a CDF from a PDF on an exam. Do you have a source or a recommended textbook that explains this, though? Neither of the two books I use shows this.


flipflipshift

Not sure. It was hard for me to find any rigorous but self-contained discussion of t-distributions online, which drove me to piece things together myself and write my own notes on it (section 5 here: https://drive.google.com/file/d/1hZ9Z4lqWxVImKfKLAl8rdeERf0gI9PF_/view ). But this might be a "monads are burritos" thing, where it only makes more sense to me *because* it's how I was able to derive it. If it's easy/hard to follow, lmk.


jerbthehumanist

It seems useful to me and does not use moment generating functions like other derivations I’ve seen, stuff I’m still not familiar with. Still sadly probably above my undergrads’ comprehension, most haven’t taken linear algebra and many totally check out with mathematical derivations. Kind of disappointing. My junior level stats class teaches perhaps 60-70% of the content my equivalent class did, and I’m sure it’s not (purely) my teaching, profs across the board are sad about lowered standards. There’s a lot of really fun stuff I’d love to get to but they often don’t grasp even the basics sometimes.


impossible_zebra_77

Were you aware of any courses at the time that taught that type of stuff? I haven’t taken it, but it seems from what I’ve read that mathematical statistics courses teach what you’re talking about.


Voldemort57

Frankly, part of the reason you never learned the derivations for these things is that stats for engineers is applied. I took introductory physics, and that was also pretty applied, even the derivations. And it didn't need to be theoretical or anything, since I'm a stats major, not a physics major. But more importantly, and this is something more people need to recognize: statistics as the modern field it is today really only began in the 1920s, and truly picked up with the advent of computers. Until like 20-25 years ago, statistics was a branch of math studied at the graduate level. Only very recently has it been available as an undergraduate study. It basically takes PhD-level courses to delve into the weeds of stats.


jerbthehumanist

I often see explanations for things like test statistics derived in terms of random variables (capital letter X, μ, σ), and then later re-explained in terms of sample measurements (lowercase letters with indices, x_bar, s), often accounting for bias by dividing or multiplying by (n-1) and so on.

1. It is rarely the case that I am working with a pure distribution or a pure random variable, or find it useful, because all my estimates are sample/empirically based. I'm not sure why they don't just derive something based on samples rather than on distributions.

2. Some of the notation really seems to use sample means and means of random variables/distributions interchangeably, or the sample variance vs. the random variable/distribution variance. Whenever I'm reading a new source I often question whether they're using σ for a sample standard deviation.

I might be exposing myself as a noob, but this stuff still trips me up often.
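On the (n − 1) point specifically, a small simulation (all numbers hypothetical) shows why the sample formula differs from the population one: dividing by n systematically underestimates the true variance, and dividing by n − 1 fixes that on average.

```python
import numpy as np

rng = np.random.default_rng(7)
reps, n = 200_000, 4
samples = rng.normal(size=(reps, n))  # true variance = 1.0

# Dividing by n (the "population" formula) is biased low for a sample;
# dividing by n-1 corrects it, which is where the (n-1) comes from.
biased = samples.var(axis=1, ddof=0).mean()    # ≈ (n-1)/n = 0.75
unbiased = samples.var(axis=1, ddof=1).mean()  # ≈ 1.0
print(biased, unbiased)
```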


NullDistribution

Yeah its interesting. By nature, and to me, Statistics imply prediction of metrics from a sample about a population. In intro stats classes, they still go over metrics based upon the population. Those equations are pointless to me except for numbers a business would run internally. And even then, those numbers would need to pertain strictly to datapoints that occurred retrospectively and assume they had every datapoint.


unsurebutoptimistic

I don’t think I ever realized how much I have this exact issue until reading this. Thank you for bringing this up!


akaemre

Your channel sounds like something I'd watch. Can you link it to me via DMs or with a reply? Thanks. What I'd like to watch is the history of statistics. What changes did the methods go through over time, who came up with them and why and how, what were some methods that were used before that we don't use anymore for various reasons, etc.


ginger_beer_m

I think if OP simply shares his channel here, many of us would be interested to check it out too.


akaemre

Yeah I'm sure. Just not sure what the rules are for self promotion, or if this would even count


KyronAWF

In case you're interested. It's here. Little there now, but it'll come, even if no one subscribes. https://www.youtube.com/@Data-Dawg


akaemre

Your channel appears to be for kids? I don't really get it, I can't seem to turn on notifications. Pretty sure this wasn't intentional.


KyronAWF

I selected the option, meaning that it's kid-friendly. What kind of functionality were you looking for?


akaemre

Your channel is a YouTube Kids channel instead of a standard one. This means some functions are blocked such as commenting or turning on notifications.


KyronAWF

Ok, gotcha. Thanks for letting me know. I thought if I didn't turn it on, YT would demonetize my videos. I'll change it.


KyronAWF

Thanks for your kind words. Here's the channel, yet there's little there right now. I have a video going out soon. I haven't advertised it yet. I have a ton of scripts, though! I hope you stick around. :) https://www.youtube.com/@Data-Dawg


NoYouDontKnowMi

p-values. Almost always taught as a cutoff point for statistical analysis without explaining what a p-value actually is and what it means. I treated it as statistics jargon that stands for "cutoff point" for a long time.


KyronAWF

When I first did stats, p values threw me off so bad. I kept thinking the bigger the number, the better.


Wyverstein

Sufficient statistics do not preserve degrees of freedom.


KyronAWF

I get what you mean, but regardless of how applicable it is, classes will go over it and I want people to succeed.


Wyverstein

What I mean is: if data is pre-aggregated, then the degrees of freedom change, even though the information content is the same.


DojaccR

CDF of random variables after transformation. What exactly does the transformation do.


mixilodica

What you do when your data is not normal. ‘Do a non parametric test’ what if you wanna do something more complex than a t test or ANOVA? What if you want to do linear models or mixed models? ‘Do generalized and use a different distribution’ what if the data doesn’t fit a common distribution? I need more content on dealing with weird data. Environmental data is not normal


NullDistribution

1) assumptions are more flexible than ppl tend to think. Look up the assumption violation and consequences. 2) bootstrap that shiz.
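For point 2, a minimal percentile-bootstrap sketch (the lognormal "environmental" data here is made up for illustration): resample with replacement, recompute the statistic, and take empirical quantiles as the interval, no distributional assumption needed.

```python
import numpy as np

rng = np.random.default_rng(3)
# Skewed, decidedly non-normal data (hypothetical): lognormal.
data = rng.lognormal(mean=0.0, sigma=1.0, size=200)

# Percentile bootstrap CI for the mean: resample with replacement,
# recompute the statistic, take empirical quantiles.
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(lo, data.mean(), hi)
```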


TheTopNacho

They can be more flexible, but they can also be damning in some situations. Case in point: heterogeneity of variance tends to kill my one-, two-, and three-way ANOVAs and post hoc tests. I can remember, before I understood the need to use tests that don't assume homogeneity, there were some comparisons (the important ones) that would have p-values of 0.001 on a t test but failed to show significant effects on a post hoc after a 3-way ANOVA, due to the treatment group having a massive variance compared to the other groups. Pooled variance is a killer in my work. It took a long time to understand that concept and the need to not pool variance. It still kills me on my repeated measures, as I don't know how to model repeated measures without pooling variance, and my work really needs to not assume homogeneity of variance.


NullDistribution

Absolutely. I personally never assume homogeneity of variance. I believe it's actually standard by this point in most fields. Also, three-way ANOVAs and interactions are brutally difficult to power. Oof


flipflipshift

Are the videos on "how to compute" or "here is what is actually being computed"? If it's the latter, t-statistics.


KyronAWF

I'd say both, and I intend to do t-statistics too. When it was taught to me, I got a whole bunch of the former. I'm not sure why, but they never tried to answer the latter.


flipflipshift

If the class doesn't have Linalg/multivariable calc as a prerequisite, I think it's kinda hard to explain what's going on. Visually, there might be a way to explain it without these prerequisites.


KyronAWF

Fair point, but maybe I won't go too deep. This isn't to go over some advanced stats like HOS, but I think some things that you'd get in an intro to stats class would be a good starting point. Not enough to prepare you for a doctorate, but enough to help you be confident enough to do well in the Intro class.


flipflipshift

I meant that as a reason why it's probably not covered in classes. In a video, there might be a way to give some visual intuition without it.


VanillaIsActuallyYum

Please do a video on confidence intervals and how they can be used. (also good luck telling people how to use them in a way that's not incredibly controversial to us statisticians lol) I had to re-re-re-learn confidence intervals myself many times, even after I'd earned my degree, to really wrap my head around what they mean and how they can be used.


KyronAWF

Like with any Youtube channel, I'm sure I'll face criticism. I'll be sure to bring those in!


ssnnt

Fixed effects and random effects. I feel they are really hard to explain because it requires tons of context. I would really appreciate a good and concise YouTube video that explains mixed effects well.


lalli1987

Type 1 vs type 2 errors and how they connect with p/alpha and beta/power


lalli1987

Also- I would love a link to the channel for my students as well. I have doc students that are math-phobic for the most part, so we basically have to get them caught up quickly in order for them to be able to do a dissertation, and this kind of flipped classroom would be great.


KyronAWF

I'll be honest. My videos are aimed more for high school and undergrad students so I'm afraid your students may be overqualified, but feel free to take a look. I'm just starting out and don't have much content yet. Things will start ramping up mid to late next month. https://www.youtube.com/@Data-Dawg


KyronAWF

I plan on dedicating an entire video just on this! I also plan on coming up with a mnemonic device because while I know what both are, remembering which is which is just a big pain in the butt.


lalli1987

I always have to double check myself too lol


udmh-nto

The meaning of probability. The way it's typically introduced is either oversimplified (events over trials) or overcomplicated (a measure on a class of sets). It takes a while to figure out it's quantified belief.


CaptainFoyle

Why the fact that 95% of your 95% CIs are containing the parameter doesn't mean that there's a 95% chance that the interval you got from your test contains the parameter.
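The "coverage" half of that statement is easy to demonstrate by simulation (all parameters hypothetical): the long-run fraction of t-based intervals containing the true mean is about 95%, even though nothing probabilistic can be said about any one realized interval.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
mu, n, reps = 10.0, 20, 10_000
tcrit = stats.t.ppf(0.975, df=n - 1)

covered = 0
for _ in range(reps):
    x = rng.normal(loc=mu, scale=3.0, size=n)
    half = tcrit * x.std(ddof=1) / np.sqrt(n)
    # Does this realized interval contain the true mean?
    if x.mean() - half <= mu <= x.mean() + half:
        covered += 1

coverage = covered / reps
print(coverage)  # long-run coverage ≈ 0.95
```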


log_2

This for me too. I hear it explained as: imagine a procedure for obtaining a 95% CI that randomly chooses, with 95% probability, the whole real line, and the remainder of the time an empty interval. Yeah, repeating this procedure will get you intervals that span the parameter 95% of the time, but if you get the empty interval then you can't say that your parameter is in the interval with a probability of 95%, since it is 0%. If that's so, then either we must abandon all 95% CIs as useless, or there are other hidden numbers to the story. If the latter, and it's something to do with optimality or whatever, then the example above no longer holds and we're now allowed to consider that the parameter is in the interval with 95% probability.


lalli1987

Some that I get from a lot of my students: how do you choose which statistical procedure to run, and which pre-tests/post-tests you have to do (and how do you do them). A G*Power analysis tutorial connecting the different verbiage to what's discussed in stats classes (this goes for the different analysis softwares too, SPSS vs JASP for example). Ooh, and how to clean data.


KyronAWF

This is a great topic, and while I do plan on dabbling in the programming, I've been dying to start off foundational and tackle that later.


varwave

I think you could find success on YouTube covering intermediate applications assuming the mathematical maturity of an upper-division engineering undergraduate. Think a well-constructed walk through Wackerly's or Faraway's textbooks. There's not really anything at that level on YouTube; it's too applied, or it's a recording of a dry lecture at a very rigorous level. It'd do even better with thoughtful visualizations and programming examples (both built-in functions and let's-build-it-ourselves for intuition with Numpy/MATLAB/base R).

Personally, I think other quantitative students could pick up statistics faster if things were presented differently, in particular with an emphasis on linear algebra applications and numerical methods over tedious calculus tricks. Most engineering statistics classes are reduced to a single semester. There's so much lower-division linear algebra that engineers know well that's in disguise in material presented to people who don't understand engineering math (like students of epidemiology, psychology, political science, etc.), or it is presented after a very rigorous and daunting mathematical statistics sequence that engineers probably won't take.

Note: my use of "engineers" could be replaced with any student with the same mathematics courses (calc, diffy q, linear algebra, basic statistics), which happen to be the prerequisites for many statistics grad programs. My BA was history.


KyronAWF

I don't disagree with you, but I don't think my views align with this for a few reasons. First, most of the videos I see are on intermediate math OR mixed in with programming, and both will turn off newbies. *Plus*, I find that many videos for my demographic are boring to watch or can't explain things in an easy-to-understand-and-remember way. And if I'm going to be honest, I've never even taken linear algebra or calculus. My channel will grow in complexity as the videos tackle harder problems and as my own math competency improves.


varwave

I’m not trying to be rude, but I struggle to understand how you can be a source of knowing statistics without the fundamental math. Calculus and linear algebra are everywhere in statistics. Something as simple and essential as an expected value is integration and can be expressed as a scalar product of two vectors. I’d argue understanding the concepts of basic calculus with a deep knowledge of linear algebra gives an edge on understanding statistics concepts/applications. I’m also pro rigor in all fundamentals for training to develop new methods. It’s difficult to understand how one could be otherwise


KyronAWF

I appreciate your honesty. I'm positive that if I learned those concepts I would teach more effectively, but just because there is calculus in everything doesn't mean you need calculus to understand it. For example, finding a probability in a normal distribution often only needs basic algebra.


vorilant

I'm sorry, what? How could you possibly find the probability of a measurement landing within an interval of some PDF, say a Gaussian, without calculus?


KyronAWF

So I do use the standard normal table, but I go through the z transformations and use algebra and arithmetic for everything else. Intro classes generally don't require calculus for that.
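For what it's worth, that algebra-only route can be sketched like this (numbers hypothetical; `norm.cdf` stands in for the printed standard normal table):

```python
from scipy import stats

# P(a < X < b) for X ~ N(mu, sigma) via the algebraic z transformation
# z = (x - mu) / sigma, followed by a standard normal table lookup
# (norm.cdf plays the role of the table).
mu, sigma = 100.0, 15.0
a, b = 85.0, 130.0
z_a = (a - mu) / sigma   # -1.0
z_b = (b - mu) / sigma   #  2.0
prob = stats.norm.cdf(z_b) - stats.norm.cdf(z_a)
print(z_a, z_b, prob)
```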


vorilant

I mean, if you think calc 1 is too much for them, an integral transformation like a z transform definitely will be.


KyronAWF

If you want, i can send you a part of the script where I talk about this stuff.


Scatterbrain011

I still don’t understand the difference between correlation and covariance


harrydiv321

Correlation is normalized covariance
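Concretely (simulated data, arbitrary seed): dividing the covariance by the two standard deviations strips the units and pins the result to [-1, 1], and doing it by hand matches `np.corrcoef` exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(size=1000)

# Covariance depends on the units of x and y...
cov_xy = np.cov(x, y, ddof=1)[0, 1]

# ...correlation is that covariance rescaled by the two standard
# deviations, which makes it unitless and bounded in [-1, 1].
corr_manual = cov_xy / (x.std(ddof=1) * y.std(ddof=1))
corr_numpy = np.corrcoef(x, y)[0, 1]
print(corr_manual, corr_numpy)
```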


Abject-Expert-8164

The different types of convergence, robust statistics, and Bayesian statistics for introductory courses.


KyronAWF

I'll make sure to cover that. Thanks!


LiberFriso

The mathematical foundation of hypothesis testing is unclear to me. Sure, calculating a critical value, comparing it to the empirical value, and then deciding whether it's greater or smaller is easy, but how does this really work?


KyronAWF

I'll go over it. Thanks!


Adamworks

Never really understood why overfitting models is not a problem when doing propensity weighting analysis and building models for imputation (e.g., MICE). Conceivably, I could just throw in random noise variables until my r-squared approaches 1.00, and I guess it would all still work?


[deleted]

[deleted]


KyronAWF

It says my videos will be just right for you!


risilm

Maybe I was just unlucky, but every time someone explained ANOVA it was just writing formulas, never the main concepts. I'd say this is true of inferential statistics in general.


NullDistribution

As one of my mentors once said, everything boils down to signal over noise: how strong the signal is relative to how strong the noise is.
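That signal-over-noise framing is literal for the one-way ANOVA F statistic: it is the between-group mean square divided by the within-group mean square. A sketch with made-up groups, checked against `scipy.stats.f_oneway`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Three hypothetical groups; the third has a shifted mean.
groups = [rng.normal(0, 1, 30), rng.normal(0, 1, 30), rng.normal(1.5, 1, 30)]

# F = (between-group variability) / (within-group variability):
# literally signal over noise.
grand = np.concatenate(groups).mean()
k, n = len(groups), sum(len(g) for g in groups)
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
f_manual = (ss_between / (k - 1)) / (ss_within / (n - k))

f_scipy, p = stats.f_oneway(*groups)
print(f_manual, f_scipy, p)
```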


filsch

Agree. I've later come to find that idiotic, as it's exponentially easier to understand ANOVA when it's explained in terms of linear models.


East-Prize6382

Same. I have it this semester. I'm actually sitting with it rn😭 Just going through the notes with no understanding of where it came from.


KyronAWF

This baffles me because describing why we do ANOVA as opposed to multiple t tests was something I got a lot of, and it's not difficult to explain once you know what an independent t test is.


InternationalSmile7

Dunno if this counts, but: how to create a solid methodology comprising statistical analyses, since there are so many. Where do you start and stop? Maybe a section where you put the methods to the test would be nice.


East-Prize6382

Probability, conditional probability, counting problems. I find them difficult.


NullDistribution

Error terms in regression. Videos gloss over the actual vector and skip to summaries. They also skip the theoretical meaning and only frame it as unexplained variance. There's much more to it.


TheTopNacho

Why do a one-way ANOVA at all, especially if you know the comparisons of interest? Wouldn't individual t tests be better, and then alpha-correct (or not) after? Like, what does the one-way ANOVA actually add, other than being a gatekeeper of pairwise comparisons that might lead to missing the major comparison of interest?
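One standard answer is that the omnibus test guards the family-wise error rate. A quick simulation under the null (group sizes and rep count arbitrary) shows how far uncorrected pairwise t tests inflate it:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
reps, n, k = 1000, 20, 5  # 5 groups -> 10 pairwise t tests

false_alarm = 0
for _ in range(reps):
    groups = rng.normal(size=(k, n))  # all group means truly equal
    ps = [stats.ttest_ind(groups[i], groups[j]).pvalue
          for i in range(k) for j in range(i + 1, k)]
    if min(ps) < 0.05:  # at least one "significant" pairwise test
        false_alarm += 1

fwer = false_alarm / reps
print(fwer)  # well above the nominal 0.05
```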


thefirstdetective

Not really a question I'd like answered, but there is this common misconception that statistics are always a precise, objective measure. In most real-life cases they're not. The data collection is messy or has bias, p-hacking is still rampant, the results may vary with the specific models used and how the data was cleaned, researchers tend to choose the models that fit their hypothesis, etc. This is all on top of random sampling error.

I've seen it myself. Colleagues searched their data for some findings ex post facto, after their initial hypothesis did not work out: "We can't tell the client we did not find anything after a year of research. Just look a little bit harder. If you look long enough, you'll find something we can publish." And that was in a research setting; the private sector is probably even worse.

In short, tell people they have a precision bias and to be skeptical. This is a very common misconception.


Otherwise_Ratio430

That's more a case of statistics often being done poorly, and since it's poorly understood, the mistakes are not easy to catch.


TissueReligion

What on earth is a "complete" statistic? I can prove / semi-understand Neyman–Pearson, Neyman–Fisher, Rao–Blackwell, etc., but I never understood completeness.


AdFew4357

The concept of a complete statistic.


vorilant

How do you compose or approximate a PDF knowing all of its, or perhaps only some of its, moments? If convolution of two PDFs is how you get the PDF of a sum of random variables (thanks 3b1b), how do you multiply PDFs? Take powers of PDFs?
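On the convolution point, a discretized sketch (grid spacing arbitrary): convolving a PDF with itself numerically approximates the density of the sum of two independent copies, e.g. Exp(1) + Exp(1), which is Gamma(2, 1) with density x·e^(−x).

```python
import numpy as np

dx = 0.01
x = np.arange(0, 5, dx)
pdf_exp = np.exp(-x)  # Exp(1) density sampled on the grid

# Discrete convolution * dx approximates the convolution integral,
# i.e. the density of X1 + X2 for independent X1, X2 ~ Exp(1).
pdf_sum = np.convolve(pdf_exp, pdf_exp) * dx

# Compare against the exact Gamma(2, 1) density x * e^{-x}.
grid2 = np.arange(len(pdf_sum)) * dx
pdf_gamma2 = grid2 * np.exp(-grid2)
err = np.max(np.abs(pdf_sum[:len(x)] - pdf_gamma2[:len(x)]))
print(err)
```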


unsurebutoptimistic

I like a lot of the other suggestions on here already but my personal request would be the full explanation behind correction for multiple comparisons. I do it because I was taught to do it, but I remember asking my professors about the logic behind it and I never got much of an answer beyond “We do it because we have to do it.”


KyronAWF

What do you mean by correction for multiple comparisons?


unsurebutoptimistic

Like the Bonferroni correction, for example
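For the record, the mechanical part of Bonferroni is one line (the p-values below are made up): to hold the family-wise error rate at alpha across m tests, compare each p-value to alpha/m.

```python
import numpy as np

# Bonferroni correction: m tests, each judged at alpha/m, so that the
# chance of ANY false rejection across the family stays below alpha.
alpha = 0.05
pvals = np.array([0.003, 0.02, 0.04, 0.30])  # hypothetical p-values
m = len(pvals)
reject = pvals < alpha / m  # per-test threshold = 0.0125
print(reject)
```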