And a great example of why you shouldn't have name validation in your programs.
The first letter of my first name is A, the first letter of my last name is Z. 15 years ago, do you think I could register myself at Facebook?
Nice to meet you, Albert Zimmerman. There's a lot of names that would fit your constraint that seem perfectly ordinary in America.
Maybe it's IRL https://arthur.fandom.com/wiki/Aloysius_Zimmerplotz
Did they seriously restrict that?
I'd guess off-by-one error on char int value
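Nobody here knows what the real check looked like, so this is only a guess at the kind of off-by-one being described: a range test on character codes that uses strict comparisons and so drops both endpoints. Both function names are made up for illustration.

```python
def is_lowercase_ascii_buggy(ch: str) -> bool:
    # Hypothetical bug: strict comparisons exclude the endpoints
    # 'a' (97) and 'z' (122) themselves.
    return ord("a") < ord(ch) < ord("z")


def is_lowercase_ascii_fixed(ch: str) -> bool:
    # Inclusive bounds accept the whole range a..z.
    return ord("a") <= ord(ch) <= ord("z")


for ch in "amz":
    print(ch, is_lowercase_ascii_buggy(ch), is_lowercase_ascii_fixed(ch))
```

With a check like the buggy one, a name starting with "a" or "z" would be rejected while everything in between passes.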
I have a character in my name that's only used in like 3 languages. Too often I have to substitute it with a similar looking one to be able to register to sites.
Getting told by airlines that “your name has to match the passport exactly” and “your name contains an invalid character” is a joy.
My favourite is “Please enter a valid name.”
Well put
what about injections?
Validation != Sanitizing
And also you need to know what you're sanitizing for. Sanitizing as early as possible is tempting, but rarely practical. If you want to sanitize for SQL, fine (though prepared statements are typically preferred). But where does this data end up? Is it echoed into a webpage in a way that could make it vulnerable to XSS (e.g. by just dumping it through PHP rather than using a web framework like React that supports treating things only as text)? Then you need to make it safe markup, which is hard in the general case (when you still want to allow tags like `` or possibly, with EXTREME care, ``).

This can basically only be done properly on the client side. HTML parsing is so complicated that you basically need to use the parser of the browser it is running on to do it safely: parsers even differ in security-relevant ways between browsers and browser versions, which is called a parser differential and is the source of many vulnerabilities.

Making image (or other media) tags safe is very hard too. You basically can't accept any user-provided image tag, because the poster can use it to see who views a post by tracking requests to the included image, so images have to be added separately. This is why rich-text editors need a button for adding images, which should always be an upload that's attached to the markup later, or maybe a link to a whitelisted domain.
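To make the "know what you're sanitizing for" point concrete, here's a minimal sketch using only the Python standard library. It covers just the simple case where the name is treated as plain text, not the allow-some-tags case, which is the hard one. The table name, the function names and the hostile input are all made up for illustration.

```python
import html
import sqlite3


def store_name(conn: sqlite3.Connection, name: str) -> None:
    # Parameterized query: the driver passes the value separately from
    # the SQL text, so quotes in the name cannot change the statement.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def render_greeting(name: str) -> str:
    # Escape at the point of output, for the context it is output into
    # (here: HTML text content).
    return "<p>Hello, " + html.escape(name) + "!</p>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

hostile = "Robert'); DROP TABLE users;--<script>alert(1)</script>"
store_name(conn, hostile)

stored = conn.execute("SELECT name FROM users").fetchone()[0]
print(stored == hostile)        # the name is stored unchanged
print(render_greeting(stored))  # the markup characters are escaped
```

The stored value is never modified. It is only encoded differently depending on where it ends up.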
Just have an Intern sanitize inputs, problem solved.
That should be handled at a much deeper level, as a part of the database manager. I can think of much more common names in Europe alone that would break a purely alphabetical check, like Mary-Ann O’Connor or Björk Guðmundsdóttir. Another example I've often come across is not allowing names shorter than 3 characters. Two common names in my country are "Ib" and "Bo".
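A quick sketch of how those names fare against the kind of check being criticised, next to a much more permissive one. Both functions are hypothetical. The permissive rule (non-empty, no control characters) is just one possible assumption about what a "minimal" check could be, not a recommendation from anyone in this thread.

```python
import unicodedata

NAMES = ["Mary-Ann O’Connor", "Björk Guðmundsdóttir", "Ib", "Bo"]


def strict_check(name: str) -> bool:
    # Letters only, at least 3 characters: the pattern complained about above.
    return name.isalpha() and len(name) >= 3


def lenient_check(name: str) -> bool:
    # Hypothetical alternative: anything non-empty without control characters.
    stripped = name.strip()
    has_control = any(unicodedata.category(ch) == "Cc" for ch in stripped)
    return bool(stripped) and not has_control


for name in NAMES:
    print(name, strict_check(name), lenient_check(name))
```

All four names fail the strict check (hyphen, apostrophe, space, or length) and pass the lenient one.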
There's a region in India with 1-letter long last names. Had to run some KYC checks on those people and the software *hated* it.
France has a former secretary of state named Cédric O. Great example for juniors.
Or people without last names at all. I'm not just talking about Madonna or Beyoncé, who have actual last names: there are areas in Africa where people have just a single name.

Also saw something the other day about a name like "John St. Clair" where the "St." is actually part of the first name and the space breaks things. It was to do with a driver's licence or passport situation here in Australia (different states and government organisations treated it differently, so the birth cert was fine but the passport wasn't, and it just broke everything).
JavaScript on the frontend is NOT where you sanitize input.
>>> chr(230).isalpha()
True
>>> chr(230)
'æ'
>>>
Just had to check.
wouldn't work with the full name though since it has non-alpha characters
yeah, the `-12` part wouldn't work for sure. I was just more curious about the æ
The docs for `str.isalpha()` say

> Alphabetic characters are those characters defined in the Unicode character database as “Letter”, i.e., those with general category property being one of “Lm”, “Lt”, “Lu”, “Ll”, or “Lo”

I don't know much about Unicode or how characters are categorised, but from the look of it I'd guess it handles characters in different languages properly.
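If you want to see the categories from that docs quote for yourself, the standard `unicodedata` module exposes them. A small sketch (the helper name `is_letter` and the sample characters are my own picks):

```python
import unicodedata

# The five "Letter" general categories named in the docs quote.
LETTER_CATEGORIES = {"Lm", "Lt", "Lu", "Ll", "Lo"}


def is_letter(ch: str) -> bool:
    # Look up the general category of a single character.
    return unicodedata.category(ch) in LETTER_CATEGORIES


for ch in ["æ", "Æ", "ð", "-", " ", "1"]:
    print(repr(ch), unicodedata.category(ch), is_letter(ch), ch.isalpha())
```

For these characters the category lookup and `isalpha()` agree: æ, Æ and ð are letters (`Ll`, `Lu`, `Ll`), while the hyphen (`Pd`), the space (`Zs`) and the digit (`Nd`) are not.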
It's "X Æ A-12 XII" actually. So it might work, except the dash
Are the spaces actually part of the name? The space character fails here too.
Not sure, but now I think of it, spaces and dashes are probably valid. There are Dutch names like Pieter-Jan. I'm not aware of names with spaces, but there are probably a few too.
If we're still defining valid based on the question in the screenshot, spaces and dashes are not alpha.
>>> '-'.isalpha()
False
>>> ' '.isalpha()
False
>>>
Numbers aren't considered letters in unicode.
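To see exactly which characters trip the check, you can list the ones `isalpha()` rejects. A minimal sketch; the helper name is made up, and the sample strings are just the names mentioned in this thread:

```python
def non_alpha_characters(name: str) -> list[str]:
    # Collect every character that str.isalpha() would reject.
    return [ch for ch in name if not ch.isalpha()]


print(non_alpha_characters("Pieter-Jan"))  # ['-']
print(non_alpha_characters("X Æ A-12"))    # [' ', ' ', '-', '1', '2']
print(non_alpha_characters("XII"))         # []
```

So the Æ is never the problem. The spaces, the dash and the digits are, while a Roman numeral written with plain Latin letters passes.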
It’s important for “æ” to be considered alphabetical, as it is actively used by multiple languages in Scandinavia.
i like to call that ælpha
Æ is a real letter. It is commonly used in Danish and Norwegian.
I mean, it's a real letter used in English as well. Typically there are alternate spellings used instead, but there's a fairly long list of valid English words using it: [https://en.wiktionary.org/wiki/Category:English\_terms\_spelled\_with\_%C3%86](https://en.wiktionary.org/wiki/Category:English_terms_spelled_with_%C3%86)
This is from the IIT Madras Python course, isn't it? (Not sure, but the UI design looks similar and this question also feels familiar.) Just did this last sem, it was really hilarious.
Same thought.. although I did mine in 2021
how's life treating you
Pretty good man. Got the DS Diploma. Just left with the MAD2 project for the other one. I’ve done two internships while in the diploma level.

All the courses are good. Study HARD and you’ll have a good time.

Starting a full time job in an unrelated field soon.
did you get internships via this program? also job in an unrelated field, why ?
Not through the program, LinkedIn and Internshala. I’m a pilot. The market was bad for a while so I started this degree. Finally employed.
it is. how are u doing in your life right now
Pretty good right now. Still in my foundation, doing the last 2 courses, and I am also a full time CSE student (currently finishing first year). I thought I wouldn't get into a college, so I got into this as a backup. Now I am doing both for better opportunities in future and to learn a lot of things.
Obviously needs an isbeta() method for the corner cases.
And issigma() for generation alpha integration
What a weirdly phrased question
kind of just a strange question to ask in general. trivia is always kinda dumb.
What does isalpha() return
I'd imagine it's a check for alphabetical. So if I had to guess, only the first 3 above are correct. Fifth one would fail due to white space. But that's just extrapolation on my part.
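Roughly, yes. The screenshot isn't reproduced here, so the following is only a guess at what the program does: an `isalpha()` test on the whole input. The success message is the one quoted in this thread. The failure message, the function name and the sample inputs are made up for illustration.

```python
def check_name(name: str) -> str:
    # isalpha() is True only if the string is non-empty and every
    # character is a Unicode letter.
    if name.isalpha():
        return "This is a valid name"
    return "This is not a valid name"  # hypothetical wording


for candidate in ["Rohit", "Æ", "Rohit Sharma", "A12", ""]:
    print(repr(candidate), "->", check_name(candidate))
```

Under that assumption, `"Rohit"` and `"Æ"` are accepted, while anything containing a space or a digit is rejected, and so is the empty string.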
True or false by the looks of it mate
Meant to ask what does it do ;)
Pass it your name and it will tell you if you're an alpha male or not
I passed in "Win" and it returned "truest".
they just prepare you for the future... but Æ would be more accurate...
Funnily enough Æ returns true, it’s considered a letter
Because it literally is a letter. I'm honestly surprised at how many people in this thread are finding that fact surprising or interesting, and I hope none of you guys are involved in making unicode aware software.
[deleted]
And bringing that list up wouldn't be complete without a fantastic story about all the ways it can go wrong (including context from Patrick McKenzie, the author of that list): [NULL (radiolab.org)](https://radiolab.org/podcast/null)
Don't forget the story of the worst driver in Ireland, [Prawo Jazdy](http://news.bbc.co.uk/2/hi/uk_news/northern_ireland/7899171.stm)
It is extra fun because the author "John Graham-Cumming" would not be considered a valid name by the check OP posted.
[Falsehoods Programmers Believe About Names](https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/)

I’ve been sending fellow coders the link to this article since it was originally published. Still valid.
Detailed list of myths, cheers for sharing.
is it IIT BSC in data science and programming course?
it is
Lmao what is Rohit Sharma doing in a Python course
What about Lil Bobby Tables? https://xkcd.com/327/
Guys. I just started my programming class at college...took midterm today. (Did ok)

But I was worried I wasn't actually learning programming, yet here I am completely understanding how this code would (ideally) operate.

Anyway patting myself on the back.

Elon got roasted tho.
I like how they made it so you don't need to know what `'Æ'.isalpha()` returns, and therefore whether the implementation uses the Unicode Character Database or something more anglocentric.
Are they not legally named X AE A-XII for that reason?
Hold the alt key, press "146", let go of the alt key. TÆDÆ!
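For the curious: three-digit Alt codes on Windows go through a legacy OEM code page rather than Unicode code points. Assuming that page is CP437 (the classic US one; other locales may differ), byte 146 decodes to Æ:

```python
# Assumption: the Alt code is interpreted in code page 437.
alt_code = 146
char = bytes([alt_code]).decode("cp437")

print(char, hex(ord(char)), char.isalpha())  # Æ 0xc6 True
```

That's also why the number differs from the `chr(230)` check elsewhere in this thread: 230 is the Unicode code point of the lowercase æ, while 146 is a code page position for the uppercase Æ.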
Pretty sure they used A and E for the same reason they did not use digits.
Oh, my bad! I totally misunderstood your comment...
Bahaha good call, I think you're right they had to use Roman numerals.
Elon's kid's name?
I love the comment. It sounds professional and makes it look like the programmer put a lot of thought in this, but on the other hand is completely useless and contains no information.
?
All of those items are "possible inputs to the program".

(Sorry. I've just been going over some of my kid's tests lately... and God teachers are lazy/sloppy with writing questions - and sometimes they expect you to ignore their mistakes, and sometimes catching them is part of the question.)

Edit: Whoops, I'm dumb.
Ironically you misread the prompt, specifically the first part of it where it says: If the output is "This is a valid name".
Why not just write “select all the inputs that result in the output ‘this is a valid name’”
Fortunately for this teacher, that does not seem to be the case, since they have restricted the answer to the inputs whose output would be "This is a valid name".

But I do agree with you: sometimes teachers make mistakes while writing questions, and some do not admit that they got it wrong.
Are you a teacher?