The resurrection of atari-wiki.com

zfrenchy
Posts: 111
Joined: 15 Jan 2023 18:14
Location: California, USA

Re: The resurrection of atari-wiki.com

Post by zfrenchy »

This is a genocide!


LOL
Icky
Site Admin
Posts: 4375
Joined: 03 Sep 2017 10:57
Location: UK

Re: The resurrection of atari-wiki.com

Post by Icky »

Identified another 312k spam users and their respective pages and revisions.

The pattern seems to be: create a spam account, create a page with the same name as the account, then once that succeeds, go on to spam the wiki.
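A rough sketch of how that account-plus-matching-page pattern could be searched for. The `users` and `pages` tables below are a cut-down, hypothetical schema built in an in-memory SQLite database purely for illustration, and the rows are toy data; the real wiki database is laid out differently. The only MediaWiki detail relied on is that page titles are stored with underscores in place of spaces.

```python
import sqlite3

def find_name_matched_accounts(conn):
    """Return user names that also exist as a page title (spam candidates)."""
    rows = conn.execute(
        """
        SELECT u.user_name
        FROM users AS u
        JOIN pages AS p
          ON p.page_title = REPLACE(u.user_name, ' ', '_')
        ORDER BY u.user_name
        """
    ).fetchall()
    return [row[0] for row in rows]

# Toy data only: hypothetical table layout and made-up names.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE pages (page_id INTEGER PRIMARY KEY, page_title TEXT);
    INSERT INTO users (user_name) VALUES
        ('BuyCheapPills'), ('SomeRealUser'), ('Casino Bonus');
    INSERT INTO pages (page_title) VALUES
        ('BuyCheapPills'), ('STE_scanlines'), ('Casino_Bonus');
    """
)
candidates = find_name_matched_accounts(conn)
print(candidates)
```

A match is only a candidate for review, not proof of spam, since a genuine user could also have a page named after themselves.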

Am now getting towards the tail of the dog where things are more nuanced, so the clean-up will take a little more time. It’s always the last percentile that is the most difficult.
exxos
Site Admin
Posts: 28350
Joined: 16 Aug 2017 23:19
Location: UK

Re: The resurrection of atari-wiki.com

Post by exxos »

I'm busy attacking the spam with some new scripts which the AI mostly wrote :lol: :hide:

I'm attacking it from the SQL database end as the wikitools are beyond garbage. They don't work properly, are out of date, or simply don't work anymore. I really have no idea how such garbage software became so popular.

Anyway, while finding the spam and users is almost straightforward, writing the scripts to delete stuff properly isn't. I'm having to write scripts, which involves reverse engineering the MediaWiki database :roll: The database is seriously fubared. It doesn't help that there have been multiple spam attacks over the years, all with slight variations on what they did. For example, username and realname were often identical for spam accounts, but some proper users also did the same :roll:

The good news is I've deleted about 4.3 million spam posts so far. My scripts are currently working on the user database and figuring out who posted what. Even with a premium server it's only doing about 100 SQL calls per second from what I can tell.

My scripts need a LOT of RAM, so I ended up with a 140GB swap file to run them :lol: It's NVMe storage so probably faster than the actual RAM anyway :lol:

b.jpg
zfrenchy
Posts: 111
Joined: 15 Jan 2023 18:14
Location: California, USA

Re: The resurrection of atari-wiki.com

Post by zfrenchy »

Thank you for all your hard work.
exxos
Site Admin
Posts: 28350
Joined: 16 Aug 2017 23:19
Location: UK

Re: The resurrection of atari-wiki.com

Post by exxos »

zfrenchy wrote: 14 Nov 2023 00:19 Thank you for all your hard work.
:thanksyellow:
exxos
Site Admin
Posts: 28350
Joined: 16 Aug 2017 23:19
Location: UK

Re: The resurrection of atari-wiki.com

Post by exxos »

So this is the speed it's going at..

I assume it will gradually get faster the more it deletes.

:ball:
chronicthehedgehog
Site sponsor
Posts: 383
Joined: 08 May 2022 18:11
Location: The Midlands

Re: The resurrection of atari-wiki.com

Post by chronicthehedgehog »

That seems faster than deleting varbinary data from Azure SQL :lol:

Are you doing it in batches or one by one?
exxos
Site Admin
Posts: 28350
Joined: 16 Aug 2017 23:19
Location: UK

Re: The resurrection of atari-wiki.com

Post by exxos »

chronicthehedgehog wrote: 14 Nov 2023 11:44 Are you doing it in batches or one by one?
It depends how you look at it.. I can only do things one by one because the script can only do one thing at a time..
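For comparison, "batches" here would mean deleting many rows per SQL statement instead of one row per call. The sketch below is not what the scripts in this thread do; it only illustrates the idea against a hypothetical `posts` table in an in-memory SQLite database, and the batch size of 500 is an arbitrary assumption.

```python
import sqlite3

def delete_in_batches(conn, post_ids, batch_size=500):
    """Delete the given post ids in chunks, one DELETE statement per chunk."""
    deleted = 0
    for start in range(0, len(post_ids), batch_size):
        chunk = post_ids[start:start + batch_size]
        # One "?" placeholder per id keeps the query parameterised.
        placeholders = ",".join("?" * len(chunk))
        cur = conn.execute(
            f"DELETE FROM posts WHERE post_id IN ({placeholders})", chunk
        )
        conn.commit()
        deleted += cur.rowcount
    return deleted

# Toy data only: 1200 rows in a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO posts (post_id) VALUES (?)", [(i,) for i in range(1, 1201)]
)
removed = delete_in_batches(conn, list(range(1, 1001)))
remaining = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(removed, remaining)
```

Committing after each chunk keeps individual transactions small, at the cost of not being able to roll back the whole run in one go.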

Though like I said earlier, there are "several" spam attack methods going on so I can only process one attack at a time. I have to write a custom script for each attack method, then do a lot of debugging to try to make sure it is not deleting any actual content.

Mostly the spam attacks seem to have started after @troed posted the "STE scanlines" article. So I manually looked through the database for posts after that, but of course with millions of them it is just not feasible to do the entire lot. Basically anything before that article is considered "safe", but there are also some spam posts starting to appear before that, so I will likely have to process those manually. I have got some safe keywords such as "atari" so it won't delete posts containing those. But of course you cannot guarantee every article will have "atari" in it. Similar with "<pre>" formatted text.. All those are considered safe because the spam accounts don't seem to be using that.
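The safeguards described above could be sketched roughly like this. The function name and the `cutoff` parameter are hypothetical, and the cutoff date used in the usage lines is a made-up placeholder, since the actual date of the "STE scanlines" article is not given in this thread.

```python
from datetime import datetime

# Markers mentioned in the post as indicating genuine content.
SAFE_KEYWORDS = ("atari", "<pre>")

def is_protected(text, posted_at, cutoff):
    """Return True if a post should be skipped by automatic deletion."""
    # Anything older than the cutoff article is left for manual review.
    if posted_at < cutoff:
        return True
    lowered = text.lower()
    return any(keyword in lowered for keyword in SAFE_KEYWORDS)

# Placeholder cutoff for illustration only, not the real article date.
cutoff = datetime(2021, 6, 1)
old_post = is_protected("Cheap watches", datetime(2021, 1, 1), cutoff)
spam_post = is_protected("Cheap watches online", datetime(2022, 1, 1), cutoff)
atari_post = is_protected("Atari STE blitter notes", datetime(2022, 1, 1), cutoff)
pre_post = is_protected("<PRE>move.w d0,d1</PRE>", datetime(2022, 1, 1), cutoff)
print(old_post, spam_post, atari_post, pre_post)
```

Note that a post before the cutoff is only protected from the automatic pass here; it can still be spam and would need checking by hand.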

There are of course a lot of posts with invalid characters which are likely spam, but I have to write a test script to filter those out. In theory only English characters should be in the wiki, but I have also noticed several posts which use "invalid" characters as well :roll: .
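One possible shape for such a test script, as a sketch only: it measures what fraction of a post is non-ASCII and flags posts above a threshold for review instead of deleting them, since some genuine posts also contain such characters. The function names and the 0.3 threshold are assumptions, and the posts are toy data.

```python
def non_ascii_ratio(text):
    """Fraction of characters in text that fall outside the ASCII range."""
    if not text:
        return 0.0
    return sum(1 for ch in text if ord(ch) > 127) / len(text)

def flag_for_review(posts, threshold=0.3):
    """Return ids of posts whose non-ASCII share exceeds the threshold."""
    return [post_id for post_id, text in posts
            if non_ascii_ratio(text) > threshold]

# Toy data only: (post_id, text) pairs.
posts = [
    (1, "Atari ST memory map"),
    (2, "купить дешево онлайн"),      # mostly non-ASCII sample text
    (3, "Café notes on the Falcon"),  # one accented letter, genuine-looking
]
flagged = flag_for_review(posts)
print(flagged)
```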

Currently I am scanning the users database and deleting any users who have not posted anything. There are a LOT of accounts like that. So it helps to safely reduce the spam accounts without losing content. It will likely delete a few genuine user accounts but after all these years they are inactive and have not posted anyway so...
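As a sketch of that clean-up step, assuming a simplified, hypothetical pair of `users` and `posts` tables (the real MediaWiki schema links users to content differently), accounts with no posts can be removed in a single statement:

```python
import sqlite3

def delete_users_without_posts(conn):
    """Delete every user that has no row in posts; return how many went."""
    cur = conn.execute(
        """
        DELETE FROM users
        WHERE NOT EXISTS (
            SELECT 1 FROM posts WHERE posts.user_id = users.user_id
        )
        """
    )
    conn.commit()
    return cur.rowcount

# Toy data only: one user with a post, two without.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users (user_id, user_name) VALUES
        (1, 'alice'), (2, 'bot1'), (3, 'bot2');
    INSERT INTO posts (user_id) VALUES (1);
    """
)
removed_users = delete_users_without_posts(conn)
left = [row[0] for row in conn.execute("SELECT user_name FROM users")]
print(removed_users, left)
```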

The majority of the spam attacks seem to have come from "guest" access to the wiki. It was like guests could post with just a username and nothing else. But it's all a huge mess anyway. I have safeguarded the guest account as much as possible. But in theory the guest account should not be posting genuine content anyway IMO.. So it is very difficult to come up with scripts to safeguard against unknown content.

But most of the spam so far is like this anyway...

Capture.PNG

So if the exact same post is listed multiple times.. it's spam. That is what one script looks for.
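The duplicate check could look something like the sketch below. It is an illustration, not the actual script: the function name is hypothetical and the posts are toy data. Each post body is hashed so that only a short digest per post is held in memory while grouping.

```python
import hashlib
from collections import defaultdict

def find_duplicate_posts(posts, min_copies=2):
    """Group post ids by exact text; return groups with min_copies or more."""
    groups = defaultdict(list)
    for post_id, text in posts:
        # Identical text always gives an identical digest.
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        groups[digest].append(post_id)
    return [ids for ids in groups.values() if len(ids) >= min_copies]

# Toy data only: (post_id, text) pairs.
posts = [
    (1, "Buy cheap watches"),
    (2, "STE scanlines explained"),
    (3, "Buy cheap watches"),
    (4, "Buy cheap watches"),
]
duplicates = find_duplicate_posts(posts)
print(duplicates)
```

Only byte-for-byte identical posts are grouped, so spam with small per-post variations would need a different script.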

Another problem is that because the scripts use such a lot of RAM, I cannot run multiple scripts at the same time. Though I think doing that would ultimately halve the speed the SQL statements are running at anyway.
troed
Posts: 936
Joined: 21 Aug 2017 22:27

Re: The resurrection of atari-wiki.com

Post by troed »

exxos wrote: 14 Nov 2023 12:02 Mostly the spam attacks seem to start happening after @troed posted the "STE scanlines" article.
I am deeply sorry!

:D
exxos
Site Admin
Posts: 28350
Joined: 16 Aug 2017 23:19
Location: UK

Re: The resurrection of atari-wiki.com

Post by exxos »

troed wrote: 14 Nov 2023 12:21 I am deeply sorry!
:awwww:

Now it's getting tidier, the latest seems to be a post by Jon Cove :shrug:

2.PNG

CPU getting a bit warm now I bet :lol:

Capture.PNG

And as a teaser...

1.PNG

I'm not really sure what the original Atari wiki logo was. I thought it was a blue Fuji, but I haven't found it in any of the files yet. Maybe we should have a new one now...
