Benutzer Diskussion:Stefan Kühn/Check Wikipedia/Archiv/2009/August

From Wikipedia, the free encyclopedia
Latest comment: 14 years ago by Stefan Kühn in section "aufgeteilte ausgabe-dateien" (split output files)

Report error on pt.wikipedia

Hello, there is an error on pt:Wikipedia:Projetos/Check Wikipedia: the report shows no errors even though they exist. Is it a script or a dump problem? Please respond at pt:Usuário Discussão:Alchimista. Alchimista 22:02, 1. Aug. 2009 (CEST)

The same goes for pl.wiki (Polish): the report is empty. PMG 19:03, 2. Aug. 2009 (CEST)
Hello, I have not changed the script in the last two weeks. Maybe there was a toolserver problem. -- sk 08:56, 4. Aug. 2009 (CEST)
Yes, there was a toolserver problem. The directory of the dumps was deleted. -- sk 21:52, 4. Aug. 2009 (CEST)

Explanatory text for priority

I don't know whether anyone has raised this before: I would find it useful if the translation page allowed you to define text that appears directly after the priority headings. This field should be empty by default and could be maintained locally. It could hold information on the importance of the errors, or similar. Der Umherirrende 12:16, 14. Aug. 2009 (CEST)

Hmm, I think all information should be given directly with the error itself. Since the priority of an error can change through discussion, such a text would need a lot of maintenance. I don't think this brings any advantage. So far no language has asked for it either. -- sk 21:48, 18. Aug. 2009 (CEST)
My point was more about a general explanation of the errors. For example, something like "The following errors are handled by bot" or "The errors of this priority are purely cosmetic; look for further possible improvements in the article", etc. Der Umherirrende 18:32, 20. Aug. 2009 (CEST)

Suggestion: category sort key identical to category name

Hello Stefan! I already raised this for the Personendaten errors, but since it actually affects all articles (and the archive bot doesn't archive as quickly here), I'm reporting it here: now and then you find categories such as [[Kategorie:Mann|Mann]], which arise when a vertical bar is inserted by accident and the software then fills in the category name. In the Mann category this currently affects 13 pages; in the January dump I found as many as 463 lines matching \[\[Kategorie:([^|]*)\|\1\]\]. --Schnark 09:34, 20. Aug. 2009 (CEST)

Great idea. I will try it out. -- sk 13:25, 22. Aug. 2009 (CEST)
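The pattern quoted above can be tried out directly. The following is only a minimal sketch in Python, not the Check Wikipedia script itself; the function name is made up for illustration, and the regular expression is the one from the post, unchanged.

```python
import re

# Pattern from the post: a category whose sort key repeats the category name.
SAME_SORTKEY = re.compile(r"\[\[Kategorie:([^|]*)\|\1\]\]")


def categories_with_redundant_sortkey(wikitext):
    """Return the names of categories whose sort key equals the category name.

    Hypothetical helper for illustration only.
    """
    return [match.group(1) for match in SAME_SORTKEY.finditer(wikitext)]


sample = (
    "[[Kategorie:Mann|Mann]]\n"
    "[[Kategorie:Frau]]\n"
    "[[Kategorie:Deutscher|Meier, Hans]]\n"
)
print(categories_with_redundant_sortkey(sample))
```

Only the first line of the sample is reported; a category without a sort key, or with a sort key that differs from the name, is left alone.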

[tt_news]

Hi. Could you also search for the texts [tt_news] and [backPid]? They should really always be escaped as [tt_news] and [backPid] (see the wiki source text). Many people don't know that, so this often ends up in articles (diff, diff) --xAwOc 05:06, 26. Aug. 2009 (CEST)

Can you explain to me where these things come from? I'm hearing about this for the first time. Do you have any background information? Maybe a Wikipedia article that explains this stuff to me. Thanks. -- sk 10:23, 26. Aug. 2009 (CEST)
Typo3 seems to use it. In any case it shows up fairly often in various URLs --xAwOc 15:24, 26. Aug. 2009 (CEST)
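A search for these two strings could look roughly like the sketch below. The escaped form mentioned in the request did not survive in this archive, so the sketch assumes that percent-encoding the brackets (%5B and %5D) is an acceptable way to escape them inside a URL; the function names and the example URL are hypothetical.

```python
import re

# The two unescaped TYPO3 parameter names named in the request.
UNESCAPED_PARAM = re.compile(r"\[(?:tt_news|backPid)\]")


def find_unescaped_typo3_params(wikitext):
    """Return every literal [tt_news] or [backPid] found in the text."""
    return [match.group(0) for match in UNESCAPED_PARAM.finditer(wikitext)]


def percent_encode_brackets(url):
    """Assumed fix: replace square brackets by their percent-encoded form."""
    return url.replace("[", "%5B").replace("]", "%5D")


url = "http://example.org/index.php?tx_ttnews[tt_news]=12&tx_ttnews[backPid]=3"
print(find_unescaped_typo3_params(url))
print(percent_encode_brackets(url))
```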

Database dump on ptwiki

If it's possible, could you run a full scan of the next database dump on pt.wiki? Thanks! Rjclaudio 23:04, 26. Aug. 2009 (CEST)

It is running. -- sk 16:42, 27. Aug. 2009 (CEST)
Just to know:
When, in theory, should one ask for a full scan?
What are the symptoms, reasons, and times to do it? --Archimëa 17:52, 27. Aug. 2009 (CEST)
The biggest reason: too much text! Symptom: it takes a lot of time. The last full scan of a frwiki dump needed 900 minutes (15 hours) for 3,330,035 pages. That is 3700 pages/min or about 61 pages/second. I think this is fast. But if I calculate this for enwiki with 17,809,799 pages, I will need 4813 minutes or 80 hours! Also, I am not alone on the toolserver. ptwiki has only 1,957,261 pages; this is not so big. Maybe 8 hours. The long runtime is the reason I stopped the automatic full scan. I think it is OK if we scan a dump every 2 or 3 months. After a scan we have a lot to do. Also, every day I check the new articles and recent changes, so many changes will be included without a full scan. -- sk 19:12, 27. Aug. 2009 (CEST)
Sorry, I think I misunderstood your question. When to ask for a full scan? Answer: when there are no errors left in the list! :-) Sometimes someone fixes only one kind of error and needs new data. This can be a reason. -- sk 22:39, 27. Aug. 2009 (CEST)
Interesting, by the way... Thx --Archimëa 07:37, 28. Aug. 2009 (CEST)
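The runtime estimate in this thread is a simple proportion, and it can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the page counts and the 900 minutes are the figures quoted above, and the assumption (made in the thread as well) is that every wiki scans at the frwiki rate.

```python
# Figures quoted in the thread.
FRWIKI_PAGES = 3_330_035
FRWIKI_MINUTES = 900


def estimate_scan_minutes(pages, ref_pages=FRWIKI_PAGES, ref_minutes=FRWIKI_MINUTES):
    """Estimate scan time, assuming the same pages-per-minute rate as frwiki."""
    pages_per_minute = ref_pages / ref_minutes
    return pages / pages_per_minute


for wiki, pages in [("enwiki", 17_809_799), ("ptwiki", 1_957_261)]:
    minutes = estimate_scan_minutes(pages)
    print(f"{wiki}: about {minutes:.0f} minutes ({minutes / 60:.1f} hours)")
```

For enwiki this gives the roughly 80 hours mentioned above; for ptwiki the same rate gives a little under 9 hours, in line with the rough guess in the reply.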

Database dump?

Hi- if it's possible with the coding, could you run a full scan of the next database dump on en.wiki? My bot's been working on some of the articles, but the script hasn't been really detecting it (sometimes I work from the bottom of a list, sometimes from the top, but it doesn't seem to matter either way). Thanks! -Drilnoth (Talk) 17:58, 17. Aug. 2009 (CEST)

Hello Drilnoth, yes, it is possible. At the weekend I will start a dump scan of enwiki for you. The problem is that my scan needs a lot of time for enwiki, because it also scans all templates for the TemplateTiger. This takes time. And en now has more than 3 million articles. At the moment I have stopped the automatic dump scanning of all languages. If a language needs a new dump scan, then please say so here. I am working on the script so that the scan will be faster in the future, but all languages are also growing. :-) -- sk 21:56, 18. Aug. 2009 (CEST)
Awesome; thank you. I understand that it is tough for the script, but just having a new dump scanned every few months is very helpful. Thanks! -Drilnoth (Talk) 17:56, 21. Aug. 2009 (CEST)
There was a problem with the toolserver. A blackout or so. I will try it next weekend again. -- sk 16:43, 27. Aug. 2009 (CEST)
Okay; thanks for the update. -Drilnoth (Talk) 22:26, 29. Aug. 2009 (CEST)

Template Navbox directly in articles

The script detects some templates written directly in articles (with error 085, which detects noinclude or includeonly). Some users use neither of them (deleting the name and other parameters...). Example here with this navbox (I have other examples if needed; I spent 2 hours cleaning up some of these two or three days ago). A navbox template must be in the "Template" namespace. Is it possible to detect whether these templates are used directly in articles?

For articles: en:Template:Navbox, fr:Modèle:Méta palette de navigation

I saw in your todo list "error 69 - no detect "ISBN-10:", "ISBN-13:", "(ISBN-10)", "(ISBN-13)" most before or after a ISBN"
MediaWiki transforms ISBN XXXXXXXX into a wikilink (I suppose MediaWiki does this on de.wiki too...). ISBN-13 or ISBN-10 does not get this treatment, so detecting it allows us to correct it. I don't understand why you want to stop detecting this.
ISBN 1212541254
ISBN-10 1212541254 <- just see the difference
Detecting and cleaning this is a good thing, from my point of view.
The problem is that the code detects it even if there is no number after it (ISBN-XX without a number is then used as a title, for example in a table...). If this has already been discussed, just forget it -- - Archimëa ⇔ 21:30, 30. Aug. 2009 (CEST)
No, I haven't forgotten this. But at the moment I have no time. Currently I am working on a new interface for Check Wikipedia, so I have stopped working on all other things. I hope I can finish this interface in the next few days. It will make it easier for all languages to work with Check Wikipedia. -- sk 14:04, 31. Aug. 2009 (CEST)
Ok. I will bump this thread in some time, so... if needed...
And I can't wait to see the new interface!! -- - Archimëa ⇔ 23:01, 31. Aug. 2009 (CEST)
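The ISBN point raised in this thread (report "ISBN-10"/"ISBN-13" only when a number actually follows) can be sketched with one regular expression. This is an illustration under assumptions, not the rule used by error 69: the pattern and the function name are made up here, and the pattern only checks that something shaped like a 10- or 13-digit number follows, not that the ISBN is valid.

```python
import re

# "ISBN-10" or "ISBN-13", an optional colon, then something that looks like
# an ISBN: digits, hyphens or spaces, ending in a digit or X.
PREFIXED_ISBN = re.compile(r"\bISBN-1[03]:?\s*(\d[\d\- ]{8,}[\dXx])")


def find_prefixed_isbns(wikitext):
    """Return 'ISBN-10 ...' / 'ISBN-13 ...' spellings that are followed by a number."""
    return [match.group(0) for match in PREFIXED_ISBN.finditer(wikitext)]


print(find_prefixed_isbns("ISBN-10 1212541254"))    # reported
print(find_prefixed_isbns("ISBN 1212541254"))       # plain form, not reported
print(find_prefixed_isbns("! ISBN-10 !! ISBN-13"))  # table header, not reported
```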

frwiki output errors

We also have some problems on the French project...

Check the output

Some tables contain many [[:]] and appear even if there are no errors...

Some of the error counts show an abnormally big drop this morning. Yesterday I was working with some lists like "List of all articles with error 067", "List of all articles with error 050", "List of all articles with error 018", of which I'm sure there are still thousands. --Archimëa 12:16, 10. Aug. 2009 (CEST)

I have changed the script and I think this could be the reason for this problem. I will check this tonight. -- sk 12:41, 10. Aug. 2009 (CEST)
Ok Thx --Archimëa 13:29, 10. Aug. 2009 (CEST)
Ditto this; a lot of the numbers for en.wp went down dramatically. My bot's been doing some stuff over the past few days, but not this much! -Drilnoth (Talk) 17:49, 10. Aug. 2009 (CEST)
OK, I hope it will run. Yesterday I fixed a bug inside the script. The problem was articles with "+" in the title, like A+ or C++. My script couldn't scan these articles. I fixed this and it works very well. But then another problem was created: titles with "&" and ', like Command & Conquer or Detroit Bright's Goodyears. I hope I have fixed this problem. Now I have started a new run of all languages. Please tell me if something is wrong. Thanks. -- sk 22:21, 10. Aug. 2009 (CEST)
Ah, that makes sense. Thanks! -Drilnoth (Talk) 22:40, 11. Aug. 2009 (CEST)
About this problem: all is fine except error 50 on fr: only 11 are detected... there are still 1500 errors. 1500 can't disappear... --Archimëa 21:25, 12. Aug. 2009 (CEST)
Error 38 is wrong also (sorry!) --Archimëa 16:47, 13. Aug. 2009 (CEST)
I checked the whole list; error 18 is wrong also, it's the last one... --Archimëa 12:13, 14. Aug. 2009 (CEST)
Thanks for this feedback. I will check this. -- sk 21:20, 13. Aug. 2009 (CEST)

OK, error 018 had a bug. I have fixed this. Error 038 is working very well. And the short lists for errors 38 and 50 are a leftover of that one run with the empty list. With the next dump scan this will be OK. -- sk 22:06, 18. Aug. 2009 (CEST)

All is fine now. The other error detections have even found some articles the script was missing here and there... --Archimëa 14:23, 21. Aug. 2009 (CEST)
I bump this: look at the table in this archived version, where 257 errors for 038 disappear -- - Archimëa ⇔ 11:17, 6. Sep. 2009 (CEST)
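The thread does not say where exactly titles such as C++ or Command & Conquer broke the script. Two places where such characters commonly cause trouble are URLs (where "+" and "&" have a special meaning) and regular expressions (where "+" is a quantifier). The sketch below only illustrates these two general pitfalls in Python, using standard-library calls; it is an assumption about the kind of bug, not a description of the actual fix, and the helper names are made up.

```python
import re
from urllib.parse import quote


def title_to_url_part(title):
    """Percent-encode a page title so that +, & and ' survive inside a URL."""
    return quote(title, safe="")


def title_matches(title, text):
    """Compare a title literally, escaping regex metacharacters such as +."""
    return re.fullmatch(re.escape(title), text) is not None


for title in ["C++", "Command & Conquer", "Detroit Bright's Goodyears"]:
    print(title, "->", title_to_url_part(title))

# Without escaping, the pattern "A+" means "one or more A" and does not
# match the literal title "A+".
print(re.fullmatch("A+", "A+") is None, title_matches("A+", "A+"))
```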

eo.wiki

Hi, I'm a user in pt.wiki (and sometimes, in eo.wiki, too). There's a broken link for the Check Wikipedia page for eo.wiki:

Instead of:

Please copy and paste that page at the toolserver to this page here.

It should be:

Please copy and paste that page at the toolserver to this page here.

One more thing: error 006 (DEFAULTSORT with special letters) produces a false positive for eo.wiki, because they use DEFAŬLTORDIGO for DEFAULTSORT. Is it possible to avoid that? Is there something I can do?

Thanks in advance. Castelobranco 06:27, 5. Aug. 2009 (CEST)

The last one (I think!): the page at the toolserver brings some special characters, like Stefan Kühn, etc. Castelobranco 02:35, 6. Aug. 2009 (CEST)
Don't use the txt! Use the HTML file. -- sk 08:46, 6. Aug. 2009 (CEST)
Is this HTML file available for all wikis? --Superyetkin 11:22, 6. Aug. 2009 (CEST)
Yes, it is. (Sorry, I fixed the link to ~sk/checkwiki/eowiki/eowiki_output_for_wikipedia.html.) -- sk 11:38, 6. Aug. 2009 (CEST)

Sorry, I didn't see that it was a link to a txt file. It's fine now, and I'm translating the page. Thanks! Castelobranco 16:08, 6. Aug. 2009 (CEST)

This section can be archived. sk 21:01, 4. Okt. 2009 (CEST)
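The DEFAŬLTORDIGO question in this section was not answered in the thread. One way a checker could cope with localized names of DEFAULTSORT is a per-language alias table, sketched below. The table, the function name and the regular expression are assumptions for illustration; only the eo alias comes from the post.

```python
import re

# Hypothetical alias table; only the eo entry is taken from the discussion.
DEFAULTSORT_ALIASES = {
    "eo": ["DEFAULTSORT", "DEFAŬLTORDIGO"],
}


def find_defaultsort_key(wikitext, lang):
    """Return the sort key of the first DEFAULTSORT (or localized alias), else None."""
    names = DEFAULTSORT_ALIASES.get(lang, ["DEFAULTSORT"])
    alternatives = "|".join(re.escape(name) for name in names)
    pattern = re.compile(r"\{\{\s*(?:" + alternatives + r")\s*:([^}]*)\}\}")
    match = pattern.search(wikitext)
    return match.group(1).strip() if match else None


print(find_defaultsort_key("{{DEFAŬLTORDIGO:Zamenhof, Ludoviko}}", "eo"))
```

With such a table the letter Ŭ in the magic word itself is never mistaken for part of the sort key, and a check like error 006 can then look at the key alone.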

Spanish Wikipedia

In es:Wikiproyecto:Check Wikipedia, section 4.6 'DEFAULTSORT is missing and title with lowercase_letters' there is no table. The output is given as a simple list of links. Sabbut 11:02, 10. Aug. 2009 (CEST)

This is normal. :-) -- sk 21:38, 18. Aug. 2009 (CEST)
This section can be archived. sk 21:01, 4. Okt. 2009 (CEST)

Page move on id:

Hi Stefan. On id:, id:Wikipedia:WikiProjekt Check Wikipedia was moved to id:Wikipedia:ProyekWiki Cek Wikipedia (and, consequently, id:Wikipedia:WikiProjekt Check Wikipedia/Translation to id:Wikipedia:ProyekWiki Cek Wikipedia/Terjemahan). That probably still needs to be fixed; or is there more to take care of for a working translation?
In addition, fy:Wikipedy:WikiProject Check Wikipedia was moved to fy:Meidogger:Stefan Kühn/WikiProject Check Wikipedia --AwOc 15:58, 12. Aug. 2009 (CEST)

Thanks for the info. I will update that at the weekend. -- sk 21:21, 13. Aug. 2009 (CEST)
OK, done. -- sk 21:59, 18. Aug. 2009 (CEST)
This section can be archived. sk 21:01, 4. Okt. 2009 (CEST)

Lowercase category name

Hello Stefan, I noticed that this error does not trigger for the last category in an article. For example, 15 Sagittae B contained Kategorie:doppelstern unnoticed since August 2008. Only after another category was added on 3 August 2009 was the error detected. Andim 10:24, 13. Aug. 2009 (CEST)

I will have to look again at where the category gets lost. Thanks for the tip. -- sk 21:22, 13. Aug. 2009 (CEST)
OK, I found the bug. It was a copy-and-paste error on my part. Thanks again! -- sk 21:45, 18. Aug. 2009 (CEST)
This section can be archived. sk 21:02, 4. Okt. 2009 (CEST)
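The actual copy-and-paste bug is not shown in the thread. As a rough illustration of the check itself, the sketch below collects every category link in one pass and tests each name, so the last category is handled like all the others. The regular expression, the namespace names it accepts and the function name are assumptions.

```python
import re

# Captures the category name up to the closing brackets or the sort key.
CATEGORY = re.compile(r"\[\[\s*(?:Kategorie|Category)\s*:\s*([^\]|]+)")


def lowercase_categories(wikitext):
    """Return all category names that start with a lowercase letter."""
    names = [name.strip() for name in CATEGORY.findall(wikitext)]
    return [name for name in names if name[:1].islower()]


article = "Text ...\n[[Kategorie:Stern]]\n[[Kategorie:doppelstern]]\n"
print(lowercase_categories(article))
```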

Split output files

Hi. Could you add three additional output files, one for each error level? The Chinese version splits it up this way because in one piece it gets too large due to the UTF-8 overhead. It could make sense to split it like this for other projects too. It would help if one didn't have to pick it apart by hand first. --xAwOc 16:38, 21. Aug. 2009 (CEST)

I am working on an implementation as a dynamic website; there this will not be a problem. I hope the Chinese Wikipedia can manage with the old mode until then. It could take a few more weeks, but the first tests already look quite good. -- sk 10:19, 26. Aug. 2009 (CEST)
Hello AwOc, the new interface is already quite far along. I hope it already helps the Chinese Wikipedia, although everything is still in English. -- sk 07:56, 1. Sep. 2009 (CEST)
This section can be archived. sk 21:02, 4. Okt. 2009 (CEST)

Help with arguments

Some French Wikipedia users doubt the need for some of the error detections.

Can you explain to me how to argue (quickly, of course...) for the goal of errors like error 002 (better reading of the wiki log?), 018 and 022 (better reading of the wiki log?), 057, and some others like HTML entities? In fact, errors that bring no real changes to articles. --Archimëa 21:50, 25. Aug. 2009 (CEST)

  • 002: The XHTML standard is <br />. Everything else is wrong. You can also write <br>; my script will not report this. But all other spellings like <br\> <\br> </br> will be found.
  • 018: The standard in Wikipedia is "category:Island". You can also write "category:island", but it is not standard. It would be nice if all had a capital first letter. Why? If you search the dump for an article with this category, you only have to search for one spelling ("category:Island"). In German Wikipedia we have no more than 20 of these problems. This is easy. OK, you in frwiki have over 30000. Maybe you will not fix this; then that is OK. Then you can deactivate this error.
  • 022: It is better for searching in the dump. I think this is no problem in frwiki. I see only one in frwiki.
  • 057: A normal headline does not end with ":". This is not good style.

I hope this helps. -- sk 10:33, 26. Aug. 2009 (CEST)

Thx for answering... --Archimëa 17:49, 27. Aug. 2009 (CEST)
This section can be archived. sk 21:02, 4. Okt. 2009 (CEST)
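The rule given for error 002 in the list above (accept <br /> and <br>, report every other spelling) can be sketched as follows. This is an illustration only: the regular expression and the function name are assumptions, and the real script may recognize more or fewer variants.

```python
import re

# Anything that looks like a br tag with slashes, backslashes or spaces
# before or after the tag name.
BR_LIKE = re.compile(r"<[\\/ ]*br[\\/ ]*>", re.IGNORECASE)

# The two spellings that, according to the explanation above, are not reported.
ACCEPTED = {"<br />", "<br>"}


def malformed_br_tags(wikitext):
    """Return every br-like tag that is neither '<br />' nor '<br>'."""
    return [
        match.group(0)
        for match in BR_LIKE.finditer(wikitext)
        if match.group(0).lower() not in ACCEPTED
    ]


print(malformed_br_tags("a<br />b<br>c<br\\>d<\\br>e</br>"))
```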

Suggestions from France: References in headline

French conventions on this point say to insert <ref>xxxxx</ref> in the article text rather than in headlines, near the words or sentences the reference is about. The problem was raised on the French Project:CheckWiki discussion page and agreed as a suggestion. Do you agree to add this detection (hoping this could be useful for other projects...)?

I don't understand the discussion in French. Please describe the problem with an example. Thanks. -- sk 09:04, 4. Aug. 2009 (CEST)
E.g. with this French article M63 -> look here at the first version -> a reference is in the first headline; it should be placed in a short introductory sentence rather than in this title... The reference should be at the end of the sentence or near the word you want to "support" with this reference. We'd like to detect this... --Archimëa 17:07, 4. Aug. 2009 (CEST)
Other examples can be found there [1] [2] [3] [4]. I hope it will be useful for you. Regards, 86.68.72.207 06:55, 14. Aug. 2009 (CEST)
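The requested detection could be sketched like this: find each headline line and report it if it contains a <ref> tag. The regular expressions and the function name are assumptions made for illustration; the sample text is invented.

```python
import re

# A headline: the same run of "=" at both ends of a line.
HEADLINE = re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)
# An opening <ref> tag (with or without attributes), but not <references />.
REF_TAG = re.compile(r"<ref[\s>/]", re.IGNORECASE)


def headlines_with_reference(wikitext):
    """Return the text of every headline that contains a <ref> tag."""
    return [
        match.group(2)
        for match in HEADLINE.finditer(wikitext)
        if REF_TAG.search(match.group(2))
    ]


sample = (
    "== Histoire<ref>Source</ref> ==\n"
    "Texte<ref>Autre source</ref>\n"
    "== Notes ==\n"
    "<references />\n"
)
print(headlines_with_reference(sample))
```

References in the running text and the <references /> tag are ignored; only the first headline of the sample is reported.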

Suggestions from France: Exception list

The problem was raised on the French Project:CheckWiki discussion page and agreed as a suggestion. A list of articles would be ignored by the script. A lot of false positives can only be fixed this way. E.g. with error 058 (Titre de section en capitales / Headline ALL CAPS): some headlines MUST be all caps (e.g. fr:Variable d'environnement, fr:James L. Jones; these headlines have to be all caps, no way around it). Or with error 037 (DEFAULTSORT nécessaire manquant / DEFAULTSORT missing for titles with special letters), e.g. fr:々; this kind of article does not need DEFAULTSORT.
This list would be maintained by each checkwiki country/users. Each exception list would be separated by error (headline 1 = error 001, headline 2 = error 002, etc.) in order to keep the list easy to maintain. How and when (before each result-building process) to read and use the exception list, I can't really answer, as I'm not able to program.
What do you think of this?

At the moment I am working on a concept for a whitelist, so that every language can enter articles there to exclude them from the processing of an error. -- sk 09:11, 4. Aug. 2009 (CEST)
Good news... --Archimëa 17:07, 4. Aug. 2009 (CEST)
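To make the proposal concrete, here is a minimal sketch of how such an exception list could be read and applied. Everything about the format is an assumption (one "== error NNN ==" headline per error, one "* Title" line per article), as are the function names; the whitelist concept mentioned in the reply may look quite different.

```python
import re
from collections import defaultdict

HEADING = re.compile(r"^==\s*error\s+(\d+)\s*==\s*$", re.IGNORECASE)


def parse_exception_list(page_text):
    """Map error number -> set of article titles listed under its headline."""
    exceptions = defaultdict(set)
    current = None
    for line in page_text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = int(heading.group(1))
        elif current is not None and line.startswith("*"):
            exceptions[current].add(line.lstrip("* ").strip())
    return exceptions


def filter_findings(findings, exceptions):
    """Drop (error, title) pairs whose title is whitelisted for that error."""
    return [
        (error, title)
        for error, title in findings
        if title not in exceptions.get(error, set())
    ]


page = (
    "== error 058 ==\n"
    "* Variable d'environnement\n"
    "* James L. Jones\n"
    "== error 037 ==\n"
    "* 々\n"
)
findings = [(58, "James L. Jones"), (58, "Paris"), (37, "々"), (37, "James L. Jones")]
print(filter_findings(findings, parse_exception_list(page)))
```

An article is only skipped for the error it is listed under, so James L. Jones is still reported for error 037 in this example.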

Suggestions from France: Thumb: "right" parameter useless

When using the parameter "thumb", "right" becomes useless and redundant. Using these two parameters together is really common. Even if the image display is not affected, "right" should be deleted. I think this detection is useful for cleaning up the display of pictures. What about this one?

Good idea. But this is already on my to-do list (see: thumbs with forced size) -- sk 09:01, 4. Aug. 2009 (CEST)
Ok. Good news. So it will be done... "right" was not explicitly written... and "thumbs with forced size" is a good detection; it should be "upright"... --Archimëa 17:07, 4. Aug. 2009 (CEST)
This section can be archived. sk 21:38, 26. Okt. 2011 (CEST)
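A detection for the thumb/right combination could be sketched as below. The namespace names, the parameter spellings and the function name are assumptions; the sketch only handles the English parameter names and skips image links whose caption contains nested links.

```python
import re

# An image link without nested brackets; the namespace list is an assumption.
IMAGE_LINK = re.compile(
    r"\[\[(?:File|Image|Fichier|Datei|Bild):[^\[\]]*\]\]", re.IGNORECASE
)


def images_with_redundant_right(wikitext):
    """Return image links that combine 'thumb' (or 'thumbnail') with 'right'."""
    hits = []
    for match in IMAGE_LINK.finditer(wikitext):
        # Strip the surrounding brackets and drop the file name itself.
        params = [part.strip().lower() for part in match.group(0)[2:-2].split("|")[1:]]
        if "right" in params and ("thumb" in params or "thumbnail" in params):
            hits.append(match.group(0))
    return hits


text = (
    "[[File:A.jpg|thumb|right|Caption]] "
    "[[File:B.jpg|thumb|Caption]] "
    "[[File:C.jpg|right|200px]]"
)
print(images_with_redundant_right(text))
```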