Commons:Bots/Requests
If you want to run a bot on Commons, you must get permission first. To do so, file a request following the instructions below.
Please read Commons:Bots before making a request for bot permission.
I. Create a user account (while logged in to your normal account) and a user page for the bot.
On the bot's user page, add {{Bot}}, which automatically adds the page to Category:Commons bots. Then add the following information to the bot's user page (all of it is mandatory):
II. Create your bot request.
Add your bot request to the list here:
III. Test run.
You may be asked to make a short test run with your bot account (30–50 edits/uploads) to allow other users to review your bot's tasks. Unauthorized test runs are not allowed.
IV. Wait for approval.
You now need to wait for community approval. A bureaucrat will close the request and will also grant a bot flag, where necessary. Closed requests are moved to Commons:Bots/Archive.
Requests made on this page are automatically transcluded in Commons:Requests and votes for wider comment.
Requests for permission to run a bot
Before making a bot request, please read the new version of the Commons:Bots page. Read Commons:Bots#Information on bots and make sure you have added the required details to the bot's page. A good example can be found here.
When complete, pages listed here should be archived to Commons:Bots/Archive.
Any user may comment on the merits of the request to run a bot. Please give reasons, as that makes it easier for the closing bureaucrat. Read Commons:Bots before commenting.
Lucasbelo.bot (talk · contribs)
Operator: Lucas.Belo (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot's tasks for which permission is being sought: Insertion of SDC into Wiki Loves Monuments images that are missing the "participant in" statement.
Automatic or manually assisted: Automatic (supervised)
Edit type (e.g. Continuous, daily, one time run): one time run
Maximum edit rate (e.g. edits per minute): between 40 and 50 edits per minute
Bot flag requested: (Y/N): Y
Programming language(s): OpenRefine
Lucasbelo.bot (talk) 01:28, 22 February 2024 (UTC)
- Discussion
- Generally supportive of the task. Could you please confirm that you have read and understood Commons:Bots-Policy and will follow it moving forward. Thank you. --Schlurcher (talk) 08:33, 22 February 2024 (UTC)
- I have read, understood and will follow the Bot account policies. Lucas.Belo (talk) 23:24, 23 February 2024 (UTC)
- Please make test run. Please do not use bot account for unrelated activities like creating this page. --EugeneZelenko (talk) 16:39, 22 February 2024 (UTC)
- Here are some edits [1].
- Looks OK for me if same code was used there. --EugeneZelenko (talk) 16:04, 24 February 2024 (UTC)
- Ok, I'll be more careful when making edits with the bot account. Lucas.Belo (talk) 23:43, 23 February 2024 (UTC)
- Who is the operator of this bot? --Krd 17:19, 22 February 2024 (UTC)
- Found it on metawiki: User:Lucas.Belo. --Achim55 (talk) 18:42, 23 February 2024 (UTC)
Leaderbot (talk · contribs)
Operator: Leaderboard (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot's tasks for which permission is being sought: Upload graphs for meta:Global statistics
Automatic or manually assisted: Automatic
Edit type (e.g. Continuous, daily, one time run): Daily
Maximum edit rate (e.g. edits per minute): ~1000/day
Bot flag requested: (Y/N): Y
Programming language(s): Python
I'm thinking of introducing graphs for Global Statistics, and (due to issues with the Graph extension) the only option is to upload images - which I plan to do on Commons. The graphs would be SVGs, and all graphs would have a common category. The account already has a bot flag and has been running for about a year on Meta.
Leaderboard (talk) 06:52, 13 February 2024 (UTC)
- Discussion
- Please create bot's user and talk pages and make test run. --EugeneZelenko (talk) 16:41, 13 February 2024 (UTC)
- @EugeneZelenko: , the bot does have a global user page (it just doesn't have a local account) - can I route the user talk page to mine? Regarding test runs, is there a specific limit (say 50)? Leaderboard (talk) 16:58, 13 February 2024 (UTC)
- Bot should have local user page with information about owner. See other bots as example. It's fine to redirect bot's talk page to yours. There is no specific limit, but 50 is too much on my mind. --EugeneZelenko (talk) 16:29, 14 February 2024 (UTC)
- Hi @EugeneZelenko: would a test on the meta:Beta Cluster (i.e, https://commons.wikimedia.beta.wmflabs.org/wiki/Main_Page) be enough? Leaderboard (talk) 16:50, 23 February 2024 (UTC)
- What is preventing uploads directly on Commons? --EugeneZelenko (talk) 16:17, 24 February 2024 (UTC)
- @EugeneZelenko: I'm asking mostly as a way to alleviate your concern on "There is no specific limit, but 50 is too much on my mind". Normally the testing process involves running it on the Beta Cluster, and then switching it to the main sites if things are OK on the cluster. The Meta people were fine with that; I didn't know what Commons' preference was. If you want me to run the test directly I can do so. Leaderboard (talk) 05:57, 25 February 2024 (UTC)
- Please test on Commons. Krd 15:48, 26 February 2024 (UTC)
- @Krd: Got it. This is likely to take a few days since I'm busy at the moment; I'll ping both of you when I get it done. Leaderboard (talk) 06:12, 27 February 2024 (UTC)
- @Krd and EugeneZelenko: Done, though the bot couldn't create pages due to it being hit by captchas - I've requested temp confirmed access at Commons:Requests_for_rights#Confirmed. As a result, the bot only uploaded the images. Leaderboard (talk) 18:05, 4 March 2024 (UTC)
- Now that the bot has confirmed rights, test Done Leaderboard (talk) 18:24, 4 March 2024 (UTC)
DaxBot (talk · contribs)
Operator: DaxServer (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot's tasks for which permission is being sought: Task #2 - Upload images from en:UCLA Library which are deemed Public Domain. "The UCLA Library Digital Collections includes rare and unique digital materials developed by the UCLA Library to support education, research, service, and creative expression." – https://digital.library.ucla.edu/ Example: https://digital.library.ucla.edu/catalog/ark:/21198/z10g97m3
Automatic or manually assisted: Supervised automatic Manual
Edit type (e.g. Continuous, daily, one time run): Regular (not continuous)
Maximum edit rate (e.g. edits per minute): 5
Bot flag requested: (Y/N): N
Programming language(s): Python
I have not yet started any development work on this, as I'm awaiting approval for a test run. -- DaxServer (talk) 11:32, 7 January 2024 (UTC)
- Discussion
- A test run is fine. Please make one. --EugeneZelenko (talk) 15:39, 7 January 2024 (UTC)
- @EugeneZelenko Test run is done with 25 images - Category:Files from UCLA Library uploaded by DaxBot. Thanks -- DaxServer (talk) 14:18, 9 January 2024 (UTC)
- Looks like UCLA metadata will need further refinement on your side. For example, File:Agastya asking Rama to deliver Dandakaranya from a curse from UCLA Library.png and File:Amirnama folio from UCLA Library.png are not photos. Also, why was the UCLA license tag used for works originating in (then) non-USA territories? Please use a language tag for the department field. --EugeneZelenko (talk) 15:36, 9 January 2024 (UTC)
- I corrected all the uploads with the respective PD licenses, dates, categories - the corrections are an output from the bot (as if it were run to upload). Thus, the manually assisted bot is capable of correcting the fields on demand. There are some quirks here and there, but I expect to update the bot logic as they're encountered.
- I'm not sure what you meant by using a language tag for the department field. Could you explain?
- Apart from that, please review and let me know the feedback, and if I missed something. Thanks! -- DaxServer (talk) 19:58, 14 January 2024 (UTC)
- For example, {{en|UCLA Digital Library – AIIS Center for Art & Archaeology – Negatives & Slides Collection}}. Is it possible to use proper copyright tag based on part of collection like all photos are 2D reproductions of public domain art? --EugeneZelenko (talk) 16:55, 15 January 2024 (UTC)
- I used the {{PD-Art|PD-old-100}} inside the {{Photograph}} template and {{PD-old-100}}{{PD-US-expired}}{{PD-country}} for the original object {{Artwork}} template. If I understand correctly what you meant, the PD tags highlighted in green would not be necessary - when the original art work/collection is in PD and the photo is its 2D repro. Did I get it right? -- DaxServer (talk) 20:36, 15 January 2024 (UTC)
- {{PD-old-100}} should be enough for cases that it covers. --EugeneZelenko (talk) 16:06, 17 January 2024 (UTC)
- Alright. Can you confirm if this is good - File:18-handed Durga from UCLA Library.png DaxServerEverywhere (talk) 08:58, 18 January 2024 (UTC)
- Please use language tags for texts like presumed Nepal. --EugeneZelenko (talk) 15:21, 18 January 2024 (UTC)
- I've updated them. Thanks! -- DaxServer (talk) 18:00, 18 January 2024 (UTC)
- Please advise: Have all issues been addressed? Is this ready to run? --Krd 14:36, 26 January 2024 (UTC)
- Yes, I've addressed all the issues raised by @EugeneZelenko. I'm awaiting his further feedback, if any -- DaxServer (talk) 18:59, 27 January 2024 (UTC)
- Could you please make another test run? --EugeneZelenko (talk) 16:26, 29 January 2024 (UTC)
- Sure, please hold. I'll try to see if I can do this Sunday DaxServerEverywhere (talk) 08:39, 31 January 2024 (UTC)
- Hi I did a test run with 15 files. Please let me know the feedback. -- DaxServer (talk) 17:48, 4 February 2024 (UTC)
- Mostly looks OK for me, but in File:An idyllic scene from UCLA Library.png {{PD-art}} is incorrectly applied to 3D work. --EugeneZelenko (talk) 15:25, 5 February 2024 (UTC)
- Ah, I see. Thanks for the note. I'll update them and in the bot-code. -- DaxServer (talk) 17:53, 5 February 2024 (UTC)
- I've updated it and 4 others to use {{PD-author|author-goes-here}} for the photos (transcluded by {{PD-UCLA-AIIS-CAA}} in this collection's case). I hope I got the license tag correct. -- DaxServer (talk) 18:21, 5 February 2024 (UTC)
- Hi. Is there any update on the review? Thanks! -- DaxServer (talk) 20:31, 16 February 2024 (UTC)
- Is source metadata enough to determine if work 2D or 3D? Or manual list creation is needed to avoid mistakes in licensing? --EugeneZelenko (talk) 15:43, 17 February 2024 (UTC)
- I believe the metadata is enough. For example, this one has a Contents Note that describes it as a terracotta material, along with the tag "terracottas (sculptural works)" under Subject Keywords, while this one is described as a manuscript with the subject/image being a painting. If a list were to be used to assist, it seems it would be fairly easy to create one - this repository has all 3D art, probably other repositories have their fields, I can investigate if needed. But in the end, the Subject Keywords [and the descriptions] on the UCLA website seem to be laid out with enough care to determine whether a work is 2D or 3D. (Just clarifying: I wasn't fully aware earlier that PD-art cannot be used for 3D works, although the bot was already aware that it is 3D (not exactly "3D" but "sculpture"), as the categories seen in this file at upload time, before your note, show. I just had to swap the licence when it's a 3D work to fix the issue.) -- DaxServer (talk) 18:15, 17 February 2024 (UTC)
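The keyword-based 2D/3D decision described above could look roughly like this. It is only a minimal sketch, not DaxBot's actual code: the marker list and both function names are hypothetical, and the choice of {{PD-Art|PD-old-100}} for 2D reproductions and {{PD-author|...}} for photos of 3D works is taken from the tags mentioned in this discussion.

```python
# Hypothetical substrings that suggest a three-dimensional work.
# The real UCLA "Subject Keywords" vocabulary may need a longer list.
THREE_D_MARKERS = ("sculptur", "terracotta", "statue")


def is_three_dimensional(subject_keywords):
    """Return True if any subject keyword hints at a 3D object."""
    lowered = [keyword.lower() for keyword in subject_keywords]
    return any(marker in keyword
               for keyword in lowered
               for marker in THREE_D_MARKERS)


def photo_license_tag(subject_keywords, photographer):
    """Pick the license tag for the photograph itself.

    PD-Art only covers faithful reproductions of 2D works, so photos of
    3D works get a tag naming the photographer instead.
    """
    if is_three_dimensional(subject_keywords):
        return "{{PD-author|%s}}" % photographer
    return "{{PD-Art|PD-old-100}}"
```

Files whose keywords match neither pattern clearly could be put on a manual list instead of being uploaded automatically.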
GeertivpBot (talk · contribs)
Operator: Geertivp (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot's tasks for which permission is being sought:
- Add missing SDC depict statements on media files (File namespace)
- Add missing Wikidata Infobox template to Category pages (Category namespace)
Automatic or manually assisted: Automatically, but monitored
Edit type (e.g. Continuous, daily, one time run): Intermittently
Maximum edit rate (e.g. edits per minute): 8 edits per minute
Bot flag requested: (Y/N): Y
Programming language(s): Pywikibot, Python scripts are on GitHub:
- https://github.com/geertivp/Pywikibot/blob/main/add_image_from_sdc.py
- https://github.com/geertivp/Pywikibot/blob/main/copy_label.py
Test runs are here.
Geert Van Pamel (talk) 22:29, 3 January 2024 (UTC)
- Discussion
- Could you please elaborate how `depicts` is filled? For example, File:Novosibirsk Regional Museum at night 2.jpg should depict building and condition of shoot (night shoot) should be qualifier. --EugeneZelenko (talk) 15:40, 4 January 2024 (UTC)
- The image depicts a "night view of the Royal museum", expressed as SDC depicts (P180) nighttime view (Q28333482) with qualifier of (P642) City Trade House (Q19908752), based upon and generated by the original Wikidata statement Q19908752#P3451 City Trade House (Q19908752) nighttime view (P3451) (M19171168). By doing so, both the SDC depict statement in Wikimedia Commons and the Wikidata statement are describing the same fact.
- What qualifier would you use instead? Can you please elaborate more about the exact statement that you would create? Thanks. Geert Van Pamel (talk) 16:23, 4 January 2024 (UTC)
- Maybe you would like to see: depicts (P180) City Trade House (Q19908752) with qualifier depicted format (P7984) nighttime view (Q28333482)? Please give your point of view/preferences. Geert Van Pamel (talk) 14:24, 5 January 2024 (UTC)
- It makes sense to have a broader discussion on the matter of qualifiers. Maybe the bot should be limited to just subjects for now? --EugeneZelenko (talk) 15:35, 5 January 2024 (UTC)
- @Geertivp: ? --Krd 14:34, 26 January 2024 (UTC)
- Or we might generate two statements without qualifiers:
- depicts (P180) City Trade House (Q19908752)
- depicted format (P7984) nighttime view (Q28333482) Geert Van Pamel (talk) 19:43, 26 January 2024 (UTC)
- But here we have the problem that depicted format (P7984) may not be used as a qualifier: d:Property:P7984#P2302 => property scope constraint (Q53869507) as main value (Q54828448). In addition to that it can only be used with work of art (Q838948) entities and requires item-requires-statement constraint (Q21503247) genre (P136), which is in general not the case in this suggested usage. Which other qualifier property could be used instead? Geert Van Pamel (talk) 18:51, 18 February 2024 (UTC)
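For reference, the "two statements without qualifiers" option discussed above would have roughly the following shape in the Wikibase JSON format. This is a minimal sketch with hypothetical helper names, not GeertivpBot's code, and it does not settle the constraint question raised for depicted format (P7984).

```python
def item_snak(property_id, item_id):
    """Build a value snak pointing at a Wikidata item."""
    return {
        "snaktype": "value",
        "property": property_id,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {
                "entity-type": "item",
                "numeric-id": int(item_id[1:]),  # "Q123" -> 123
                "id": item_id,
            },
        },
    }


def statement(property_id, item_id):
    """Wrap a snak as a normal-rank statement without qualifiers."""
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": item_snak(property_id, item_id),
    }


# depicts (P180) City Trade House (Q19908752)
# depicted format (P7984) nighttime view (Q28333482)
claims = [
    statement("P180", "Q19908752"),
    statement("P7984", "Q28333482"),
]
```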
FlickypediaBackfillrBot (talk · contribs)
Operator: Alexwlchan (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information) , working for the Flickr Foundation
Bot's tasks for which permission is being sought:
- Improving structured data for Flickr photos which have been uploaded to Wikimedia Commons, e.g. adding creator, license metadata.
- Adding the new Flickr photo ID (P12120) property to all files, to make it easier for other tools to work with Flickr photos
Automatic or manually assisted: unsupervised
Edit type (e.g. Continuous, daily, one time run): manually triggered
Maximum edit rate (e.g. edits per minute): tbc, probably 5–10 edits per second
Bot flag requested: (Y/N): Y
Programming language(s): Python
- Discussion
- Please don't make manual edits with the bot account. Please make few test edits. --Krd 14:31, 1 November 2023 (UTC)
- Interesting proposal. I made one edit to Commons:Flickypedia/Data Modeling, otherwise this looks good. Curious how you will handle conflicting existing SDC claims? --Schlurcher (talk) 17:53, 1 November 2023 (UTC)
- Good question! My general approach with these things is to be extremely conservative – imo the V1 bot should be purely additive, and any conflicts should be flagged for manual inspection.
- Then a couple of things might happen:
- The existing SDC looks wrong, so I make a manual edit from my account to fix it. e.g. I’ve already been looking at the use of source of file (P7482) for Flickr photos in the SDC snapshots, and I found ~200 cases where the URL points to the Flickr URL’s profile (/photos/{username}) rather than the photo itself (/photos/{username}/{photo_id}). Those got dropped on a queue and I’ve been gradually tidying them up by hand – opening the files in question and making a manual edit from my account to point to the more specific URL.
- The existing SDC looks right, so I work out why the bot is disagreeing. Is it a bug in my code, have I interpreted the data mapping wrong, is the data mapping at odds with the community approach to SDC, is the bot missing some bit of info on the Flickr photo. But the bot won't do anything on its own.
- There might also be cases where the existing SDC is wrong in large numbers and we'd want to write an automated fix, but that's somewhat risky and I’d want to be extremely careful before doing that. Two possible examples spring to mind:
- License versions. Flickr photos use CC 2.0 licenses, so that's what the bot will write into the SDC. But what if it finds a Wiki Commons file which links to the 4.0 version of the CC license? That sounds like an easy candidate for a fix buuuut I think there are Flickr users who leave descriptions on their photos saying "I license this as CC 4.0". A human copying their photo across would notice that; the bot might not. So in this case the bot would likely leave it as-is to avoid deleting info.
- Date granularity. Flickr has different levels of granularity for "date taken". Most photos are DDMMYY, but there are some which are MMYY or YY or "Circa YY". If there are lots of cases where there's an imprecise date but the SDC claims it's a full DDMMYY, we might consider automating that. (It's pretty obvious when this has happened – Flickr always returns a full timestamp from its API, but it sets all the unknown values to 0/1. So a YYYY becomes taken="1950-01-01 00:00:00" takengranularity="6".) The bot could be written to fix these. But I don't know if that's a widespread issue in practice.
- If/when the bot does start editing existing SDC claims, I'll make sure we document those with examples – and if there are cases that seem contentious, I'll bring them back for community discussion before actually implementing them. Alexwlchan (talk) 08:13, 2 November 2023 (UTC)
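The date-granularity case above could be detected with a check like the one below. This is a minimal sketch rather than the Flickypedia code: the function names are hypothetical, the granularity codes (0, 4, 6, 8) are the ones documented for the Flickr API, and the precision numbers (9 = year, 10 = month, 11 = day) follow the Wikibase time datatype.

```python
# Flickr "takengranularity" -> Wikibase time precision.
# Granularity 8 ("circa") is deliberately left out: it has no exact
# equivalent, so such photos are not judged automatically.
FLICKR_TO_WIKIBASE_PRECISION = {
    0: 11,  # full date known -> day precision
    4: 10,  # year and month known -> month precision
    6: 9,   # year only -> year precision
}


def flickr_precision(takengranularity):
    """Return the Wikibase precision for a Flickr granularity, or None."""
    return FLICKR_TO_WIKIBASE_PRECISION.get(int(takengranularity))


def sdc_is_overprecise(sdc_precision, takengranularity):
    """True if the SDC date claims more precision than Flickr reports."""
    expected = flickr_precision(takengranularity)
    return expected is not None and sdc_precision > expected
```

With the example above (takengranularity="6"), an SDC date stored with day precision would be reported as over-precise, while one stored with year precision would not.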
- To return to this question of "how does the bot handle conflicting edits":
- Right now the bot will flag any conflicts as "unknown", not make any edits, and put them in a manual queue for review. I’ll look at them and decide if we need to update the bot code, do a manual edit to the SDC, or leave it be.
- Example: license has changed since upload to WMC
- I just ran it against File:MINDANAO BLEEDING-HEART DOVE (6939195884).jpg.
- The file on Commons has an existing copyright license (P275) = CC BY 2.0 statement
- The photo on Flickr currently has a CC BY-SA 2.0 license
- This confuses the bot, because it wants to write a different SDC statement to what’s currently in Commons – so it flags it as “unknown”.
- I went and had a look at it, and I can see that the license has changed since the initial upload – there’s a license history feature on Flickr, and it was changed from CC BY 2.0 in April 2014, a year after it was uploaded to Commons.
- (And now I'm going to look at tweaking the bot code so it gets the license from when the photo was uploaded to Commons, and uses that rather than whatever the license is now. But license is a pretty well-populated field, so I may not need this in practice.) Alexwlchan (talk) 08:22, 13 December 2023 (UTC)
- Brief addendum to this: I’m going to take license out of the bot for now.
- 1. Licenses are already pretty well-populated in SDC, so the potential gain here is less.
- 2. I’m encountering a lot of cases where Flickr users have changed their license after the fact, which makes the bot unhappy.
- It is possible to see license history on Flickr as far back as 2008, or I could inspect the Wikitext, but I’m going to leave it for now. I can come back later and see how many Flickr photos are actually missing a license in practice. Alexwlchan (talk) 14:45, 13 December 2023 (UTC)
- To add another example to this:
- If the bot encounters conflicting information in the "date taken" field, it flags a warning but doesn’t do anything.
- e.g. File:STS059-238-074 Strait of Gibraltar.jpg is a photo which was posted to both Flickr and a NASA website. On Flickr the taken date is "April 1994", but on NASA's website we get the more precise date "17 April 1994", which is what's used in the SDC.
- Flickypedia would write a statement "April 1994" if it was copying the photo fresh from Flickr, but it doesn't overwrite the existing, more-precise statement when it does the backfill. Alexwlchan (talk) 11:02, 15 December 2023 (UTC)
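The "purely additive, flag conflicts for review" behaviour described in these examples can be summarised in a few lines. This is a minimal sketch under simplifying assumptions, not the Backfillr implementation: the function name is hypothetical and statement values are reduced to plain strings.

```python
def plan_backfill(existing, proposed):
    """Split proposed statements into safe additions and conflicts.

    existing, proposed: dicts mapping a property ID to a simplified value.
    Returns (to_add, review_queue). Nothing already present in the SDC
    is overwritten; disagreements are only queued for a human to inspect.
    """
    to_add = {}
    review_queue = []
    for prop, new_value in proposed.items():
        if prop not in existing:
            to_add[prop] = new_value  # missing statement: safe to add
        elif existing[prop] != new_value:
            # conflict: flag as "unknown" and make no edit
            review_queue.append((prop, existing[prop], new_value))
        # identical value: nothing to do
    return to_add, review_queue
```

For the license example above, an existing P275 of CC BY 2.0 and a proposed CC BY-SA 2.0 would land in the review queue, while a missing Flickr photo ID (P12120) would simply be added.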
- 👍 I’ll probably get to making some test edits early next week, and I’ll link them here for inspection when they’re done. Alexwlchan (talk) 07:46, 2 November 2023 (UTC)
- I know it’s been a couple of weeks and nothing has happened on this.
- I am planning to get back to this bot eventually, but right now I’m prioritising getting the “uploader” part of Flickypedia working. Once that’s done, I’ll come back to the Backfillr bot. Alexwlchan (talk) 09:47, 23 November 2023 (UTC)
- I left some comments about the data model at Commons_talk:Flickypedia/Data_Modeling#Some_feedback_based_on_User:GeographBot. Where can we find the source code? The bot I mentioned is at https://github.com/multichill/toollabs/blob/master/bot/commons/geograph_uploader.py . Multichill (talk) 20:52, 16 November 2023 (UTC)
- Thanks for your feedback on the model; I’ll address that there.
- The source code isn’t public yet, but by the time we run the bot properly (and probably before I have time for test edits) it’ll be available here: https://github.com/Flickr-Foundation/flickypedia Alexwlchan (talk) 09:49, 23 November 2023 (UTC)
- I Oppose this bot request until two concerns are addressed:
- Source code should be published.
- Logic for looking up Wikidata items based on Flickr user ID (P3267) is removed (details on Commons talk:Flickypedia/Data Modeling)
- Multichill (talk) 19:50, 18 December 2023 (UTC)
- 1. Fixed! Sorry, this slipped my mind – I had a README and documentation all sorted so I could make it public, but never actually hit the button. 😅
- 2. I’ll reply to that on the other thread. Alexwlchan (talk) 09:04, 19 December 2023 (UTC)
- @Alexwlchan: I think both issues have been addressed now, right? If so, I see no objection to run this bot. Multichill (talk) 20:22, 6 February 2024 (UTC)
- Yup, both issues addressed! Thanks for your feedback. Alexwlchan (talk) 13:58, 26 February 2024 (UTC)
- Test edits are done! You can see some examples of the bot's changes here:
- File:Neasden Temple - Shree Swaminarayan Hindu Mandir - Power Plant.jpg
- File:Traditional vessel (Stone Town).jpg
- File:TimesSquare-500px.jpg
- File:Rfid implant after.jpg
- File:Bryn Athyn Cathedral - Pennsylvania (4825981267).jpg Alexwlchan (talk) 08:08, 13 December 2023 (UTC)
- Thanks for doing the test edits. Content looks good. I only have technical comments.
- Please combine these four edits into one: [2]
- If you use a JSON data specification this can be done by simply merging all the different claims.
- Please tag the edits with "BotSDC", as lots of users use this tag to filter out SDC edits.
- If you use a JSON post request this can be done by adding { "tags", "BotSDC" }
- Please make sure you specify a maxlag for your edits to avoid database overload, as this got me into trouble once.
- If you use a JSON post request this can be done by adding { "maxlag", "2" }
- In the edit summary, please link the phrase structured data to [[Commons:Structured data|structured data]] or this bot request so users can find out more if needed.
- I would appreciate if you could perform another set of bot edits that incorporate this. --Schlurcher (talk) 08:46, 13 December 2023 (UTC)
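Taken together, the four suggestions above amount to one request per file. The sketch below builds the POST parameters for the MediaWiki `wbeditentity` action; it is an illustration with a hypothetical function name and an assumed summary text, not the bot's actual code, and it only assembles the parameters without sending anything.

```python
import json


def build_edit_request(media_id, claims, csrf_token):
    """Build parameters for a single wbeditentity call.

    media_id: MediaInfo ID of the file, e.g. "M123" (illustrative).
    claims: list of statement dicts, all written in one edit.
    csrf_token: edit token obtained from the API beforehand.
    """
    return {
        "action": "wbeditentity",
        "format": "json",
        "id": media_id,
        # All claims go into one "data" payload, so they form one edit.
        "data": json.dumps({"claims": claims}),
        "tags": "BotSDC",  # lets users filter out SDC bot edits
        "maxlag": "2",     # back off when database replicas are lagging
        "bot": "1",
        # Assumed wording; the summary links the phrase as requested.
        "summary": "Update [[Commons:Structured data|structured data]]",
        "token": csrf_token,
    }
```

When the servers are lagging, the API answers such a request with a `maxlag` error, and the client is expected to wait and retry rather than resend immediately.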
- Thanks for the quick feedback! I’ve addressed all four of your suggestions.
- 1. Done, duh. For some reason I got it into my head that you can’t modify multiple properties at once, but I think that’s just a limitation of the visual editor? API seems fine with it, so that’s changed.
- 2. Done.
- 3. Done. I’m also planning to drop a note to somebody who works on the structured data team before I start running the bot at large scale, as a courtesy – backfilling Flickr data means 10s of millions of new statements, and I figure it’ll be easier if they have a direct line to the person adding database load.
- 4. Done. I’ve also added the property IDs, which I figured might be useful.
- Some more test edits:
- https://commons.wikimedia.org/w/index.php?title=File:Bolle_med_rygeost_(5080938528).jpg&diff=prev&oldid=830338946
- https://commons.wikimedia.org/w/index.php?title=File:%C5%81ukasz_Simlat_2013.jpg&diff=prev&oldid=830339027
- https://commons.wikimedia.org/w/index.php?title=File:2x_F-16CJ_91-352_%26_91-366_(6843244831).jpg&diff=prev&oldid=830339160
- https://commons.wikimedia.org/w/index.php?title=File:Restaurant_Kiin_Kiin_Sukkerr%C3%B8r_med_lime_(6200226667).jpg&diff=prev&oldid=830339211
- Alexwlchan (talk) 12:39, 13 December 2023 (UTC)
- Thanks. No further comments from my end. My database issue was described here [3] and as I learned, as long as we respect maxlag it should be fine. As I've myself added 100s of millions of statements, I would not be too concerned about this request. Contrary, I think it is an excellent addition to improving SDC use. --Schlurcher (talk) 13:27, 13 December 2023 (UTC)
Please summarize: Have all issues been addressed? --Krd 04:28, 21 December 2023 (UTC)
- @Alexwlchan: ? --Krd 13:57, 30 December 2023 (UTC)
- Hi @Krd – sorry for the delay, I took a couple of weeks break from working on that. Getting back into it now, hope to wrap my head around what’s still needed soon! Alexwlchan (talk) 15:37, 22 January 2024 (UTC)
- @Krd I believe all the issues have been addressed now. :D Alexwlchan (talk) 13:58, 26 February 2024 (UTC)
@Alexwlchan: I just looked at your test edits. Edit summary like "Update the P12120, P1433, P170, P7482 properties in the ..." is quite cryptic. Either change it to something human readable, or use magic links like "Update the Flickr photo ID (P12120), published in (P1433), creator (P170), source of file (P7482) properties in the ...", you can do that by using d:Special:EntityPage/P170. Does make the summary very (too) long. Multichill (talk) 20:29, 6 February 2024 (UTC)
- From your example File:Neasden Temple - Shree Swaminarayan Hindu Mandir - Power Plant.jpg, you are not importing Flickr tags to P2572. Why not? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:56, 11 February 2024 (UTC)
- Where to put Flickr tags was discussed over here: Commons talk:Flickypedia/Data Modeling#Two folksonomies in Commons
- I didn't see a consensus that we should definitely put Flickr tags in P2572; if that changes later we can always change the Backfillr mapping and re-run it. Alexwlchan (talk) 10:39, 26 February 2024 (UTC)
- Thanks, I’ll change it to "Update SDC based on metadata from Flickr"! Alexwlchan (talk) 10:35, 26 February 2024 (UTC)
- I think I’ve addressed all the feedback, as well as can be expected. Please let me know if there’s anything else I need to do to get the bot approved! Alexwlchan (talk) 14:51, 27 February 2024 (UTC)
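Multichill's suggestion above (link each property ID to a human-readable label, but fall back to a short generic summary when the result gets too long) can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the `PROPERTY_LABELS` table, the function names and the `max_length` default of 500 characters are assumptions made for the example; the labels and the fallback text are the ones quoted in this discussion.

```python
# Hypothetical label table; the labels are those quoted in the discussion above.
PROPERTY_LABELS = {
    "P12120": "Flickr photo ID",
    "P1433": "published in",
    "P170": "creator",
    "P7482": "source of file",
}

# Short generic summary proposed by the operator in this thread.
FALLBACK_SUMMARY = "Update SDC based on metadata from Flickr"


def property_link(pid):
    """Return a wikilink to the property page, labelled when a label is known."""
    target = f"d:Special:EntityPage/{pid}"
    label = PROPERTY_LABELS.get(pid)
    if label is None:
        # Unknown property: still link it, but show the bare ID.
        return f"[[{target}|{pid}]]"
    return f"[[{target}|{label}]] ({pid})"


def build_summary(pids, max_length=500):
    """Build a linked summary, or fall back to the generic one if it is too long.

    max_length is an assumed limit for this sketch, not a value taken from
    the discussion.
    """
    linked = "Update the " + ", ".join(property_link(p) for p in pids) + " properties"
    if len(linked) <= max_length:
        return linked
    return FALLBACK_SUMMARY


if __name__ == "__main__":
    print(build_summary(["P12120", "P1433", "P170", "P7482"]))
```

With four properties the linked form already runs to well over 200 characters, which illustrates the length concern raised above.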
GlaMainBot (talk · contribs)
Operator: Beao (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot's tasks for which permission is being sought: I've automated lossless crops for Category:Images from the German Federal Archive with borders and need permission to start uploading the results.
Automatic or manually assisted: Manually assisted to start.
Edit type (e.g. Continuous, daily, one time run): One time run for uploads, otherwise daily for my listing of User:GlaMainBot/Most_used_images_for_cleanup
Maximum edit rate (e.g. edits per minute): At most ten uploads per minute.
Bot flag requested: (Y/N): Y
Programming language(s): TypeScript (using mwn)
Beao 07:49, 23 September 2023 (UTC)
- Discussion
- Just to clarify: Is this to upload as a new version, or to overwrite? If the latter, is there a consensus to do so? I see that those borders include photo credits to the individual photographers, and these are from a respected archive, so I'd just want to make sure that there is agreement that this is desired; I've seen similar situations go either way. Clearly more useful in Wikipedia articles without the borders, but it's not clear to me that we don't want also to host a version with the credit line on the image. - Jmabel ! talk 23:58, 1 October 2023 (UTC)
- My thought is to overwrite. I've not seen any written consensus on the matter, but in practice that's what has been done for years in this category. I think that implies a silent consensus, considering these captions have been digitally added by the archive and provide no additional information not already in the description. Beao (talk) 08:36, 2 October 2023 (UTC)
- Please make some example edits. Krd 17:06, 6 October 2023 (UTC)
- All right, here are three examples:
- File:Bundesarchiv Bild 137-068842, Sonderzug der Einwandererzentralstelle.jpg
- File:Bundesarchiv Bild 137-068843, Sonderzug der Einwandererzentralstelle.jpg
- File:Bundesarchiv Bild 137-068852, Sonderzug der Einwandererzentralstelle.jpg Beao (talk) 10:55, 7 October 2023 (UTC)
- Looks good to me. Krd 13:49, 11 October 2023 (UTC)
- Any more information or discussion needed? Beao (talk) 12:22, 14 October 2023 (UTC)
- [4] Why is this updated so often? Krd 03:04, 19 October 2023 (UTC)
- The "Images with watermarks" category is very big, so the retrieval of file usage statistics is batched to a fixed number of images every hour to avoid performance spikes, and I update the gallery after every batch. Is updating gallery pages too often problematic? I could do it less often (I'm thinking if images are not removed from the category), and also avoid doing it when nothing changes. Beao (talk) 15:53, 20 October 2023 (UTC)
- Please at least don't update when nothing significant changes. Krd 07:42, 21 October 2023 (UTC)
- I've updated the code to update only on changes now! Beao (talk) 09:01, 24 October 2023 (UTC)
- It appears to me that there are still too many edits on the statistics pages. (Or is there any relevant work done on these maintenance categories?) Krd 14:31, 10 November 2023 (UTC)
- @Beao: ? --Krd 05:41, 17 November 2023 (UTC)
- I've updated the code a couple of days ago and did some extra runs to confirm that it worked, and since then the non-changing categories haven't updated. But yeah, I'm also removing watermarks! Beao (talk) 07:27, 17 November 2023 (UTC)
- Are Special:Diff/826638942 and Special:Diff/826649450 useful edits? --Krd 06:32, 1 December 2023 (UTC)
- @Beao: ? --Krd 13:53, 30 December 2023 (UTC)
- Not extremely useful, not completely useless. But okay, I will limit the updates more by rounding the stats to the nearest 100 or 1000. Beao (talk) 14:11, 30 December 2023 (UTC)
- I think the usefulness of the edits hasn't been shown at all. --Krd 09:10, 20 January 2024 (UTC)
- @Beao: Please publish the source code so that we can check it for any bugs.--Junior Jumper (formerly Tæ) 09:38, 20 February 2024 (UTC)
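The two mitigations discussed in this thread (only saving the statistics page when something changes, and rounding the usage figures to the nearest 100 or 1000 so that small fluctuations do not count as a change) could look roughly like the sketch below. It is written in Python for illustration only; the bot itself is stated to be written in TypeScript, its source has not been published here, and every name and the gallery layout in this example are assumptions.

```python
def round_to_step(count, step=100):
    """Round a usage count to the nearest multiple of step (halves round up)."""
    return ((count + step // 2) // step) * step


def render_gallery(usage, step=100):
    """Render the gallery text from {file title: usage count}, using rounded counts.

    Sorting on the rounded count (then the title) keeps the order stable
    when the raw counts fluctuate slightly.
    """
    rounded = {title: round_to_step(count, step) for title, count in usage.items()}
    lines = ["<gallery>"]
    for title, count in sorted(rounded.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"{title}|Used about {count} times")
    lines.append("</gallery>")
    return "\n".join(lines)


def needs_update(current_text, usage, step=100):
    """Return True only if the rendered page would differ from the saved text."""
    return render_gallery(usage, step) != current_text


if __name__ == "__main__":
    saved = render_gallery({"File:A.jpg": 1210, "File:B.jpg": 340})
    # Small fluctuations round to the same figures, so no edit would be made.
    print(needs_update(saved, {"File:A.jpg": 1225, "File:B.jpg": 338}))
    # A larger change crosses a rounding boundary, so the page would be saved.
    print(needs_update(saved, {"File:A.jpg": 1310, "File:B.jpg": 340}))
```

Comparing the rendered text rather than the raw numbers means the bot edits only when a reader would actually see a difference on the page.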