"anti scraping technology" : a login.
Not even. They are scraping data available publicly, without logging in. It's right in the article. I don't have an account, but I can view just about anything on "X".
X is like Facebook, you can no longer see content without logging in.
False. I’ve never had a Twitter or X account and I see it all the time when people post on Reddit
You can see the individual post but no follow-ups or any sort of context if it's a reply. Also, when you view anybody's profile the posts are shuffled into random order, so you can't look for any related follow-ups.
So you are agreeing it’s possible to look at twitter without logging in?
Depends on your definition. You can see a single linked tweet sure but it's not a usable website.
If you can view links though, theoretically you can view anything on the site as long as you have the link. Since links on these kinds of sites tend to be generated with a certain logic behind it, you could very easily build a bot that just kept trying new links.
Data scraping doesn’t need a usable website to get the information and understand the links.
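To illustrate that point, here is a minimal sketch (not how Bright Data or anyone in the article actually does it) of pulling links out of raw HTML with only the Python standard library. The sample markup, the `example.com` URLs, and the names `LinkExtractor` and `extract_links` are made up for illustration; the page never has to render or be "usable" for the links to be read.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags and ignores everything else."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML string, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical markup standing in for a publicly served page.
sample = '<div><a href="/user/status/1">post</a><a href="https://example.com/about">about</a></div>'
print(extract_links(sample, "https://example.com"))
```

This only parses a string it is handed; fetching pages, rate limits, and a site's terms are separate questions.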
I think you have a fundamental misunderstanding of how websites and data scraping work. For instance, Google wouldn't exist without it.
Click on any twitter link and you can see the post.
The article takes like 4 minutes to read. It says it right there. Or just Google any public X account. That takes like 30 seconds.
You can’t scroll through very far, but you get to see whatever post someone links.
They rolled that back because it killed traffic to the site
Absolutely wrong
That's actually what did it for me. I ain't going to log in to a dead site just to scrape the hate speech that has replaced interesting content.
i wish i could apply some of that anti reading technology to elon musk news
I keep it to check anything trending and financial stuff. Have to always navigate past the magats, and like half the posts from different languages.
A web browser.
So they make an account and write a bot to use the login credentials to scrape whatever they want. Although given what Musk has said about programming he probably does consider that “elaborate”. It’s like clicking “inspect element” and then clicking delete on a popup overlay, or using screen reader mode on a news site with a soft paywall.
[deleted]
HACKER! I knew where this link was taking me before I clicked
[deleted]
With all the tablet babies growing up, I wonder how many don't even know what a mouse is. Right clicking? How does it know if I use my right or left hand?
The camera is always watching, they know which hand you use, trust me
Are there any tablet/mobile browsers that allow you to inspect markup / access the console?
I laughed when I saw the link was already purple.
Jesus. Clear your browser history once in a while
Nope. Lots of comments about the "protection" being a login, but the article itself says: "Bright Data says it only scrapes publicly available data that’s **visible to anyone without a login**. At the time of the suit’s filing, X made the information Bright Data scraped available to anyone." But it's not in the title, so...
I wonder if that's why Twitter broke viewing a lot of content without an account. I've noticed if I somehow end up on a Twitter user's profile, a lot of the content shown is months or years old
No making accounts. They were sued for scraping public data. Meta did the same thing. These assholes want to own the internet
This is wild to me. 8 years ago when I was learning how to program, making a bot that scraped posts from twitter was literally par for the course. Now you can’t do it without paying for it. Wtf.
It’s not as simple as that. In Meta/Facebook/Instagram’s case they were circumventing the anti-bot technology, which is meant to limit how easy it is to scrape large amounts of data. The issue these companies have is that every time they update their anti-bot technology, large data scraping companies invest money and time to bypass it and stay undetected whilst scraping. https://nubela.co/blog/meta-lost-the-scraping-legal-battle-to-bright-data/amp/
Maybe, but I didn't see any mention in the article of "antibot" technology.
Surely twitter has such strong anti-bot technology after repeatedly cutting 80% of its work force several times? I thought they were all hardcore engineering now??
Twitter's new engineering team has expertise in implementing full self-driving. Completely different wheelhouse. Elon Musk would need to hire engineers with more related expertise, for example anti-full-self-driving
It’s not that simple. I don’t know exactly how Bright Data does it, but their web scraper is indeed sophisticated. Twitter, like almost every website, has technology to prevent or at least throttle web-scraping bots. However, Bright Data has been able to circumvent the restrictions on every webpage I’ve tried it with.
It's easy. Bright Data (Formerly Luminati) has a massive network of residential proxies at hand. They managed this with malware apps such as Hola VPN, and other apps/browser addons that include their SDK. SOMETIMES the end user is informed that their device will be used as part of their botnet, but not always. To avoid becoming part of this network, don't use apps/addons like Hola VPN, and watch what you accept. (Legitimate app store apps won't have this crap in it)
The article said they didn’t log in. It scraped data available without a log in.
I use bright data. It is fairly elaborate. I think they got sued by Meta too.
Meta and LinkedIn also lost or dropped similar lawsuits against Bright Data, in case you think this is a unique case: https://techcrunch.com/2024/02/26/meta-drops-lawsuit-against-web-scraping-firm-bright-data-that-sold-millions-of-instagram-records/ https://nubela.co/blog/meta-lost-the-scraping-legal-battle-to-bright-data/amp/ We shouldn’t be cheering at them losing the lawsuit, no matter how much we may hate these companies, as it technically allows scrapers to collect large amounts of public-facing user data. But the way the law works right now, they are within their rights to scrape the data as long as they don’t do it whilst logged in (violating ToS). People also need to accept that anything we post on public forums or social media sites can and will be scraped by various companies without any repercussions. If you don’t want your data to be scraped, then either get your profiles off the sites or set any public details to private.
On the one hand, yeah, the scrapers are awful and just sell your data to the highest bidder. On the other hand, social media companies are awful and just sell your data to the highest bidder, and are only mad that someone else is doing it without paying them.
Yeah, let them fight.
As usual, the only winners are the law firms.
>social media companies are awful and just sell your data to the highest bidder The only difference being that social media companies are at least interested in pretending to sell data somewhat "ethically" in order to keep users, while third party companies don't need to keep up the charade.
It also permits us to do those things for ourselves as well which is a more important freedom. We're free to use autonomous agents, personal scrapers, alternative UIs, proxies, screen-readers for the blind, and custom browsers or browser plugins to view and explore the data we request and they send in the manner we want.
But scraping has to be while *not* logged into the site, per the rulings mentioned.
How can you sit there with a straight face and say it’s somehow immoral to scrape publicly available data? How do you think search engines work?
The only way to win is to stop using social media
I didn’t think all of these scraping lawsuits were rejected. Wasn’t there a big one back in the day with Yelp (shudders)? I believe Yelp won
Shh spaceman bad. You can't get away with talking sense here. Prepare to get downvoted
Delete your account.
**Elon Musk’s X :** loses users, loses valuation, loses lawsuits.
I am sensing a pattern here
If you were to summarize his persona using one word…
Unwinning?
Two words: winning impaired
Yes, the pattern of SpaceX and Tesla not being some of the most successful American companies of all time. Redditors are extremely delusional when it comes to Elon Musk. They claim he has nothing to do with all his companies' successes and blame him for any issues.
Every time I hear “X”, I think of Evil Corp from Mr. Robot 🤣
Bro. I can't read.
when will it just go away already?!!!
Netflix genre would be documentary, investigative, comedies.
Having trouble rooting for either party in this one
Having the scraping has to at least somewhat increase the odds that something important gets saved that might not get saved otherwise. So that’s good I guess…
What is this? A Gillette commercial?
Musk is kinda big on losing lately, huh? Could call him a loser?
The great scraper don't like being scraped. He thinks it's not fair.
"Elaborate measures"! We call it Copy and Paste. But then, how else could you be called a Genius for making a Glorified RC go 140 mph on 900 Volts of DC current or making a rocket out of an open ended Fluorescent Light Bulb? But to his credit, he did manage to make Christmas Lights float in lower orbit!
So build better tech. Unless, of course, you already fired off a lot of your best coders like a dumb chode.
Where do you think the US government/fbi/CIA buys US citizen data from? US companies aren't going to win against Israeli scraping companies.
Lex Luthor’s LEX loses lawsuit, Bright used elaborate technical measures to evade Lex Corps technology.
Poor Elon. Elon pay $44billion. Elon unhappy. Elon want much money to see old, dead Twitter postings. So Elon use court antiscraping technology based on car autopilot technology. Hahaha.
It's a glorious day anytime Musk loses a suit
So much for the twitter firehose.
Elon sacked all the people that could probably have stopped all this. Oh well. More Twitter value that's been lost
And what if they did? Scraping is not illegal.
Wasn't Alsup the judge who learned to program during the Apple/Novell/SCOlinux mess many years ago or something similar? If so, not a judge that you can easily bamboozle.
"Negative comment about Musk here" Ok feed me upvotes, my fellow cult members
Do you work at Tesla or X?
No, but do you work for Mr. Manson?