Benutzer:Srbauer/Pywikipedia/Doc


My notes while trying to understand the Pywikipedia framework - library functions are not (yet) covered.

The basis is the inline comments, which will (hopefully) be supplemented bit by bit with comments of my own.

Configuration

config.py

Initializes the environment via global variables.

user-config.py

Overrides these variables with user-specific settings.
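
A minimal sketch of such a user-config.py (the account name 'ExampleBot' is a placeholder):

# user-config.py - minimal sketch; 'ExampleBot' is a placeholder account name
mylang = 'de'                                  # language code of the home wiki
family = 'wikipedia'                           # wiki family to work on
usernames['wikipedia']['de'] = u'ExampleBot'   # bot account per family and language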

Basic modules for logging in

family.py

Parent class for all wiki families
  • class Family

families\wikipedia_family.py

The Wikimedia family that is known as Wikipedia, the Free Encyclopedia
  • class Family(family.Family)

login.py

Script and class to log the robot in to a wiki account.
  • class LoginManager
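
Typical invocations - assuming the standard command-line interface, where -all logs in to every wiki for which a user name is configured:

>>> login.py
>>> login.py -all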

mediawiki_messages.py

Allows access to the MediaWiki messages, that is, the label texts of the MediaWiki software in the current language. These can be used in other bots.

userinterfaces\terminal_interface.py

  • class UI

userinterfaces\transliteration.py

wikipedia.py

Library to get and put pages on a MediaWiki.
  • class Page(object)
  • class ImagePage(Page)
  • class XmlPage(Page)
  • class GetAll(object)
  • class Throttle(object)
  • class Site(object)
  • + Exception-Classes
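
A minimal get/modify/put round trip, as a sketch of the classic interface ('Beispielseite' and the edit itself are placeholders):

import wikipedia

site = wikipedia.getSite()                     # home wiki from user-config.py
page = wikipedia.Page(site, u'Beispielseite')
text = page.get()                              # raises NoPage if the page is missing
text = text.replace(u'alt', u'neu')            # some arbitrary change
page.put(text, comment=u'Bot: Beispieledit')   # save with an edit summary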

xmlreader.py

Each XmlEntry object represents a page, as read from an XML source.
The MediaWikiXmlHandler can be used for the XML given by Special:Export as well as for XML dumps.
  • class XmlEntry
  • class XmlHeaderEntry
  • class MediaWikiXmlHandler(xml.sax.handler.ContentHandler)
  • class XmlParserThread(threading.Thread)
  • class XmlDump(object)
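
A sketch of how a dump is consumed ('dump.xml' is a placeholder file name):

import xmlreader

dump = xmlreader.XmlDump('dump.xml')
for entry in dump.parse():                     # generator of XmlEntry objects
    print entry.title, len(entry.text)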

Modules that are imported locally in individual functions or methods of wikipedia.py

catlib.py

cosmetic_changes.py

This module can do slight modifications to a wiki page source code such that the code looks cleaner. The changes are not supposed to change the look of the rendered wiki page.
WARNING: This module needs more testing!
  • class CosmeticChangesToolkit
  • class CosmeticChangesBot

date.py

watchlist.py

Allows access to the bot account's watchlist.

... and imported by these modules

pagegenerators.py

... and finally the GUI

gui.py

A window with a unicode textfield where the user can e.g. edit the contents of an article.
  • class EditBoxWindow:
  • class ListBoxWindow:

Bots

basic.py (bot template)

This is not a complete bot; rather, it is a template from which simple bots can be made. Change workon to edit the contents of a Wikipedia page, save the result as mybot.py, and then just run:
>>> mybot.py
to have your change be done on all pages of the wiki. If that takes too long to work in one stroke, run:
>>> mybot.py Pagename
to do all pages starting at pagename.
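
The heart of such a bot is a single page-handling function; a minimal sketch (the real template adds argument parsing and page generators around it):

import wikipedia

def workon(page):
    old = page.get()                           # current wiki text
    new = old                                  # ...modify 'new' here...
    if new != old:
        page.put(new, comment=u'Bot: automated change')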

casechecker.py

Script to enumerate all pages in the wiki and find all titles that mix the Latin and Cyrillic alphabets.
(does it even run?)
  • class CaseChecker(object)

catall.py

Add or change categories on a number of pages. Usage:
>>> catall.py name
goes through pages, starting at 'name'. Provides the categories on the page and asks whether to change them. If no starting name is provided, the bot starts at 'A'.
Options:
-onlynew : Only run on pages that do not yet have a category.

category.py

Scripts to manage categories.
>>> category.py action [-option]
where action can be one of these:
* add - mass-add a category to a list of pages
* remove - remove category tag from all pages in a category
* move - move all pages in a category to another category
* tidy - tidy up a category by moving its articles into subcategories
* tree - show a tree of subcategories of a given category
and option can be one of these:
* -person - sort persons by their last name (for action 'add')
* -rebuild - reset the database
* -from: - The category to move from (for the move option); Also, the category to remove from in the remove option
* -to: - The category to move to (for the move option)
NOTE: If the category names have spaces in them, surround the names with single quotes, e.g. -to:US -from:'United States'
* -batch - Don't prompt to delete emptied categories (do it automatically).
* -summary: - Pick a custom edit summary for the bot.
* -inplace - Use this flag to change categories in place rather than rearranging them.
  • class CategoryDatabase
  • class CategoryMoveRobot
  • class CategoryRemoveRobot
  • class CategoryTidyRobot
  • class CategoryTreeRobot
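
For example, to move all pages from a category whose name contains spaces and delete the emptied source category without prompting:

>>> category.py move -from:'United States' -to:US -batch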

clean_sandbox.py

This bot cleans the sandbox page by resetting it to a predefined default text.
  • class SandboxBot

commons_link.py

Include commons template in home wiki.
  • class CommonsLinkBot

copyright.py

This robot checks page text for copyright violations by searching for it with Google and Yahoo.
  • class CheckRobot

delete.py

This script can be used to delete pages en masse. Of course, you will need an admin account on the relevant wiki.
>>> delete.py [-category categoryName]
Command line options:
-category: Delete all pages in the given category.
-links: Delete all pages linked from a given page.
-ref: Delete all pages referring to a given page.
-always Don't prompt to delete pages, just do it.
-summary: Supply a custom edit summary.
  • class DeletionRobot

extract_wikilinks.py

Script to extract all wiki page names a certain HTML file points to in interwiki-link format (runs offline).

featured.py

>>> featured.py [-interactive] [-nocache] [-fromlang:xx,yy|-fromall]
-interactive  : ask before changing page
-nocache  : do not use the /featured/cache file that remembers which articles were already verified.
-fromlang:xx,yy  : take featured articles from the given source languages; use -fromall to take them from all languages.

followlive.py

Script to follow new articles on a wiki and flag them with a template or, if necessary, blank them.
There must be A LOT of bugs! Use with caution and verify what it is doing!
  • class PageHandler
  • class CleaningBot

image.py

This script can be used to change one image to another or remove an image entirely.
  • class ImageRobot

imageharvest.py

Bot for getting multiple images from an external site. It takes a URL as an argument and finds all images (and other files specified by the extensions in 'fileformats') that URL is referring to, asking whether to upload them. If further arguments are given, they are considered to be the text that is common to the descriptions.

imagetransfer.py

Script to copy images to Wikimedia Commons, or to another wiki.
Syntax:
>>> imagetransfer.py pagename [-interwiki] [-tolang:xx] [-tofamily:yy]
Arguments:
-interwiki Look for images in pages found through interwiki links.
-tolang:xx Copy the image to the wiki in language xx
-tofamily:yy Copy the image to a wiki in the family yy
-file:zz Upload many files listed in a text file containing image links of the form [[Image:Xx]], [[Image:Yy]]
  • class ImageTransferBot

inline_images.py

This bot goes over multiple pages of the home wiki, and looks for images that are linked inline (i.e., they are hosted on an external server and hotlinked, instead of using the wiki's upload function).
It is especially useful when you intend to disable the inline-linking feature.

This script understands various command-line arguments:
-start: used as -start:page_name, specifies that the robot should go alphabetically through all pages on the home wiki, starting at the named page.
-file: used as -file:file_name, read a list of pages to treat from the named text file. Page titles should be enclosed in double square brackets.
-ref: used as -ref:page_name, specifies that the robot should touch all pages referring to the named page.
-links: used as -links:page_name, specifies that the robot should touch all pages referred to from the named page.
-cat: used as -cat:category_name, specifies that the robot should touch all pages in the named category.
-redir specifies that the robot should touch redirect pages; otherwise, they will be skipped.
All other parameters will be regarded as a page title; in this case, the bot will only touch a single page.
  • class InlineImagesRobot

interwiki.py

Script to check language links for general pages. This works by downloading the page and using existing translations plus hints from the command line to download the equivalent pages from other languages. All such pages are downloaded as well and checked for interwiki links recursively until no new links are encountered. A rationalization process then selects the right interwiki links, and if the result is unambiguous, the interwiki links in the original page are automatically updated and the modified page uploaded.
  • class LinkMustBeRemoved(wikipedia.Error)
  • class Global(object)
  • class Subject(object)
  • class InterwikiBot(object)

interwiki_graph.py

  • class GraphImpossible(Exception):
  • class GraphSavingThread(threading.Thread):
  • class GraphDrawer:

isbn.py

This bot goes over multiple pages of the home wiki, and converts all ISBN-10 codes to the ISBN-13 format.
  • class IsbnBot
  • class InvalidIsbnException(wikipedia.Error)
  • class ISBN13
  • class ISBN10
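
The conversion itself is mechanical: prepend the EAN prefix 978, drop the old check digit, and recompute the ISBN-13 check digit. A standalone sketch of the algorithm (not the bot's own code):

def isbn10_to_isbn13(isbn10):
    # Prepend '978' and drop the ISBN-10 check digit (which may be 'X').
    digits = '978' + isbn10[:-1]
    total = 0
    for i in range(12):
        weight = (i % 2) * 2 + 1               # weights alternate 1, 3, 1, 3, ...
        total += int(digits[i]) * weight
    check = (10 - total % 10) % 10             # ISBN-13 check digit
    return digits + str(check)

# e.g. isbn10_to_isbn13('020161622X') -> '9780201616224'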

makecat.py

This bot takes as its argument (or, if no argument is given, asks for it), the name of a new or existing category. It will then try to find new articles for this category (pages linked to and from pages already in the category), asking the user which pages to include and which not.
Arguments:
-nodates automatically skip all pages that are years or dates (years only work AD, dates only for certain languages)
-forward only check pages linked from pages already in the category, not pages linking to them. Is less precise but quite a bit faster.
-exist only ask about pages that do actually exist; drop any titles of non-existing pages silently. If -forward is chosen, -exist is automatically implied.
-keepparent do not remove parent categories of the category to be worked on.
-all work on all pages (default: only main namespace)

movepages.py

Bot that moves pages to another title. Can also handle special Wikibooks-style pages.
Command-line arguments:
-cat Work on all pages which are in a specific category. Argument can also be given as "-cat:categoryname".
-ref Work on all pages that link to a certain page. Argument can also be given as "-ref:referredpagetitle".
-link Work on all pages that are linked from a certain page. Argument can also be given as "-link:linkingpagetitle".
-start Work on all pages on the home wiki, starting at the named page.
-prefix Automatically move all pages whose titles begin with a given prefix. Argument can also be given as "-prefix:Python/Pywikipediabot/".
-from -to The page to move from and the page to move to.
-new Work on the most recent new pages on the wiki.
-del Can be given together with the other arguments; deletes the old page after it has been moved. For example: "movepages.py Helen_Keller -del".
  • class MovePagesWithPrefix
  • class MovePagesBot

nowcommons.py

  • class NowCommonsDeleteBot

pagefromfile.py

This bot takes its input from a file that contains a number of pages to be put on the wiki. The pages should all have the same begin and end text (which may not overlap).

redirect.py

Script to resolve double redirects, and to delete broken redirects. Requires access to MediaWiki's maintenance pages or to an XML dump file. The delete function requires adminship.
>>> redirect.py action [-argument]
where action can be one of these:
  • double - fix redirects which point to other redirects
  • broken - delete redirects where targets don't exist. Requires adminship.
and argument can be:
  • xml - retrieve information from a local XML dump (http://download.wikimedia.org). Argument can also be given as "-xml:filename.xml". If this argument isn't given, info will be loaded from a special page of the live wiki.
  • namespace:n - Namespace to process. Works only with an XML dump. Currently not supported!
  • restart:n - Number of redirect to restart with (see progress). Works only with an XML dump. Currently not supported!
  • class RedirectGenerator
  • class RedirectRobot
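
For example, fixing all double redirects found in a local dump ('dump.xml' is a placeholder):

>>> redirect.py double -xml:dump.xml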

refcheck.py

This script checks references to see if they are properly formatted. Right now it just counts the total number of transclusions of any number of given templates.
>>> refcheck.py command [arguments]
Command line options:
-count Counts the number of times each template (passed in as an argument) is transcluded.
-namespace: Filters the search to a given namespace. If this is specified multiple times, it will search all given namespaces.

replace.py

This bot will make direct text replacements. It will retrieve information on which pages might need changes either from an XML dump or a text file, or only change a single page.
You can run the bot with the following commandline parameters:
-xml - Retrieve information from a local XML dump (pages_current, see http://download.wikimedia.org). Argument can also be given as "-xml:filename".
-file - Work on all pages given in a local text file. Will read any wiki link and use these articles. Argument can also be given as "-file:filename".
-cat - Work on all pages which are in a specific category. Argument can also be given as "-cat:categoryname".
-page - Only edit a specific page. Argument can also be given as "-page:pagetitle". You can give this parameter multiple times to edit multiple pages.
-ref - Work on all pages that link to a certain page. Argument can also be given as "-ref:referredpagetitle".
-filelinks - Works on all pages that link to a certain image. Argument can also be given as "-filelinks:ImageName".
-links - Work on all pages that are linked to from a certain page. Argument can also be given as "-links:linkingpagetitle".
-start - Work on all pages in the wiki, starting at a given page. Choose "-start:!" to start at the beginning.
NOTE: You are advised to use -xml instead of this option; this is meant for cases where there is no recent XML dump.
-regex - Make replacements using regular expressions. If this argument isn't given, the bot will make simple text replacements.
-except:XYZ - Ignore pages which contain XYZ. If the -regex argument is given, XYZ will be regarded as a regular expression.
-summary:XYZ - Set the summary message text for the edit to XYZ, bypassing the predefined message texts with original and replacements inserted.
-fix:XYZ - Perform one of the predefined replacements tasks, which are given in the dictionary 'fixes' defined inside the file fixes.py. The -regex argument and given replacements will be ignored if you use -fix. Currently available predefined fixes are:
* HTML - convert HTML tags to wiki syntax, and fix XHTML
* syntax - try to fix bad wiki markup.
* case-de - fix upper/lower case errors in German
* grammar-de - fix grammar and typography in German
-namespace:n - Number of namespace to process. The parameter can be used multiple times. It works in combination with all other parameters, except for the -start parameter. If you e.g. want to iterate over all user pages starting at User:M, use -start:User:M.
-always - Don't prompt you for each replacement
other: - First argument is the old text, second argument is the new text. If the -regex argument is given, the first argument will be regarded as a regular expression, and the second argument might contain expressions like \1 or \g<name>.
NOTE: Only use either -xml or -file or -page, but don't mix them.
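
Two example invocations (the page title, texts, and dump file name are placeholders): a plain replacement on a single page, and a regex replacement driven by a local dump:

>>> replace.py -page:Beispielseite "old text" "new text"
>>> replace.py -regex -xml:dump.xml "([0-9]+)m" "\1 m"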

saveHTML.py

This bot downloads the HTML pages of articles and images and saves the interesting parts, i.e. the article text and the footer, to a file like Hauptseite.txt.
Options:
-o: Specifies the output-directory where to save the files
-images: Download all images
-overwrite:[I|A|B] Ignore existing Images|Article|Both and download them even if they exist

selflink.py

This bot goes over multiple pages of the home wiki, searches for selflinks, and allows removing them.
This script understands various command-line arguments:
-start: used as -start:page_name, specifies that the robot should go alphabetically through all pages on the home wiki, starting at the named page.
-file: used as -file:file_name, read a list of pages to treat from the named text file. Page titles should be enclosed in double square brackets.
-ref: used as -ref:page_name, specifies that the robot should work on all pages referring to the named page.
-links: used as -links:page_name, specifies that the robot should work on all pages referred to from the named page.
-cat: used as -cat:category_name, specifies that the robot should work on all pages in the named category.

solve_disambiguation.py

Script to help a human solve disambiguations by presenting a set of options. Command line options:
-pos:XXXX adds XXXX as an alternative disambiguation
-just only use the alternatives given on the command line, do not read the page for other possibilities
-primary "primary topic" disambiguation (Begriffsklärung nach Modell 2). That's titles where one topic is much more important, the disambiguation page is saved somewhere else, and the important topic gets the nice name.
-primary:XY like the above, but use XY as the only alternative, instead of searching for alternatives in Keyword (disambiguation). Note: this is the same as -primary -just -pos:XY
-file:XYZ reads a list of pages from a text file. XYZ is the name of the file from which the list is taken. If XYZ is not given, the user is asked for a filename. Page titles should be inside double brackets. The -pos parameter won't work if -file is used.
-always:XY instead of asking the user what to do, always perform the same action. For example, XY can be "r0", "u" or "2". Be careful with this option, and check the changes made by the bot. Note that some choices for XY don't make sense and will result in a loop, e.g. "l" or "m".
-main only check pages in the main namespace, not in the talk, wikipedia, user, etc. namespaces.
-start:XY goes through all disambiguation pages in the category on your wiki that is defined (to the bot) as the category containing disambiguation pages, starting at XY. If only '-start' or '-start:' is given, it starts at the beginning.

spellcheck.py

This bot spellchecks Wikipedia pages. It is very simple: it only checks whether a word, stripped to its 'essence', is in a word list or not; it does not do any grammar checking or the like. It can be used in two ways:
options:
-start:Title: Go through the wiki, starting at title 'Title'.
-newpages: Go through the pages on Special:Newpages
-longpages: Go through the pages on Special:Longpages

Argument:

Title: Check a single page; after this the bot will ask whether you want to check another page

splitwarning.py

Splits an interwiki.log file into chunks of warnings separated by language (runs offline).

sqldump.py

Reads a cur SQL dump and offers a generator over SQLentry objects which can be used by other bots. Each SQLentry object represents a page (runs offline).

Can also be run directly from the command line to retrieve page lists from an SQL dump.

>>> sqldump.py -sql:filename.sql action
Where action can be one of these:
- find - List pages which contain a certain text
- findr - List pages containing text matching a regular expression
- shortpages - List pages with short contents
- unmountedcats - List categories that don't have a supercategory
- percentnames - List pages that contain internal links where special characters are encoded as hexadecimal codes, e.g. %F6
- baddisambiguations - Created for de.wikipedia to fix primary topic disambiguations (Begriffsklärung nach Modell 2).

standardize_interwiki.py

Loop over all pages in the home wiki, standardizing the interwiki links.

standardize_notes.py

This bot will standardize footnote references. It will retrieve information on which pages might need changes either from an SQL dump or a text file, or only change a single page.
At present it converts to Wikipedia:Footnote3 format (ref/note).
You can run the bot with the following commandline parameters:
-sql - Retrieve information from a local SQL dump (cur table, see http://download.wikimedia.org). Argument can also be given as "-sql:filename".
-file - Work on all pages given in a local text file. Will read any wiki link and use these articles. Argument can also be given as "-file:filename".
-cat - Work on all pages which are in a specific category. Argument can also be given as "-cat:categoryname".
-page - Only edit a single page. Argument can also be given as "-page:pagename". You can give this parameter multiple times to edit multiple pages.
-regex - Make replacements using regular expressions. (Obsolete; always True)
-except:XYZ - Ignore pages which contain XYZ. If the -regex argument is given, XYZ will be regarded as a regular expression.
-namespace:n - Namespace to process. Works only with an SQL dump
-always - Don't prompt you for each replacement
other: - First argument is the old text, second argument is the new text. If the -regex argument is given, the first argument will be regarded as a regular expression, and the second argument might contain expressions like \1 or \g<name>.
NOTE: Only use either -sql or -file or -page, but don't mix them.

table2wiki.py

Nifty script to convert HTML tables to MediaWiki's own syntax.
-xml - Retrieve information from a local XML dump (pages_current, see http://download.wikimedia.org). Argument can also be given as "-xml:filename". Searches for pages with HTML tables, and tries to convert them on the live wiki.
-sql - Retrieve information from a local mirror. Searches for pages with HTML tables, and tries to convert them on the live wiki.
-start: - used as -start:pagename, specifies that the robot should go alphabetically through all pages on the home wiki, starting at the named page.
-file:filename - Will read any wikipedia link and use these articles

template.py

Very simple script to replace a template with another one, and to convert the old MediaWiki boilerplate format to the new template format.
>>> template.py [-remove] [xml[:filename]] oldTemplate [newTemplate]
Specify the template on the command line. The program will pick up the template page, and look for all pages using it. It will then automatically loop over them, and replace the template.
Command line options:
-remove Remove every occurrence of the template from every article
-xml Retrieve information from a local dump (http://download.wikimedia.org). If this argument isn't given, info will be loaded from the maintenance page of the live wiki. Argument can also be given as "-xml:filename.xml".
-namespace: Only process templates in the given namespace number (may be used multiple times).
-summary: Lets you pick a custom edit summary. Use quotes if edit summary contains spaces.
-always Don't bother asking to confirm any of the changes, Just Do It.
-page: Only edit a specific page. You can use this argument multiple times to work on multiple pages. If the page title has spaces in it, enclose the entire page name in quotes.
-extras Specify this to signal that all parameters are templates that should either be substituted or removed. Allows you to input far more than just two. Not compatible with -xml (yet). Disables template replacement.
other: First argument is the old template name, second one is the new name. If only one argument is given, the bot resolves the template by putting its text directly into the article. This is done by changing {{...}} or {{msg:...}} into {{subst:...}}.
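
For example, replacing one template by another, or removing one entirely (the template names are placeholders):

>>> template.py OldName NewName
>>> template.py -remove OldName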

templatecount.py

This script will display the list of pages transcluding a given list of templates. It can also be used to simply count the number of pages (rather than listing each individually).
>>> templatecount.py command [arguments]
Command line options:
-count Counts the number of times each template (passed in as an argument) is transcluded.
-list Gives the list of all of the pages transcluding the templates (rather than just counting them).
-namespace: Filters the search to a given namespace. If this is specified multiple times, it will search all given namespaces.

touch.py

This bot goes over multiple pages of the home wiki and edits them without making any change. This is used, for example, to get category links in templates working.
This script understands various command-line arguments:
-start: used as -start:page_name, specifies that the robot should go alphabetically through all pages on the home wiki, starting at the named page.
-file: used as -file:file_name, read a list of pages to treat from the named textfile. Page titles should be enclosed in double-squared brackets.
-ref: used as -ref:page_name, specifies that the robot should touch all pages referring to the named page.
-links: used as -links:page_name, specifies that the robot should touch all pages referred to from the named page.
-cat: used as -cat:category_name, specifies that the robot should touch all pages in the named category.
-redir specifies that the robot should touch redirect pages; otherwise, they will be skipped.

upload.py

Script to upload images to Wikipedia.
Arguments:
-keep Keep the filename as is
-noverify Do not ask for verification of the upload description if one is given
If any other arguments are given, the first is the URL or filename to upload, and the rest is a proposed description to go with the upload. If none of these are given, the user is asked for the file or URL to upload. The bot will then upload the image to the wiki.
  • class UploadRobot

us-states.py

Checks pages on the English Wikipedia to see whether they have the form "Something, State", and if so, creates a redirect from "Something, ST".

warnfile.py

A robot to implement backlinks from an interwiki.log file without checking them against the live wiki.
  • class WarnfileReader
  • class WarnfileRobot

weblinkchecker.py

This bot is used for checking external links found at the wiki. It checks several pages at once, with a limit set by the config variable max_external_links, which defaults to 50.
The bot won't change any wiki pages; it only reports dead links so that people can fix or remove the links themselves.
  • class LinkChecker(object)
  • class LinkCheckThread(threading.Thread)
  • class History
  • class DeadLinkReportThread(threading.Thread)
  • class WeblinkCheckerRobot

windows_chars.py

Script to replace bad Windows-1252 (cp1252) characters with HTML entities on ISO 8859-1 wikis. Don't run this script on a UTF-8 wiki.
  • class SqlWindows1252PageGenerator
  • class WindowsCharsBot

Helper modules

catlib.py

Library to work with category pages on Wikipedia.
  • class Category(wikipedia.Page)
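
A sketch of typical use - the articles() generator follows the old catlib interface (an assumption), and the category name is a placeholder:

import wikipedia, catlib

site = wikipedia.getSite()
cat = catlib.Category(site, u'Category:Example')
for page in cat.articles():                    # pages directly in the category
    print page.title()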

date.py

Consists of various lists which are required by some other programs.
  • class FormatDate(object):

editarticle.py

Edit a Wikipedia article with your favourite editor. Requires Python 2.3.
  • class TextEditor
  • class ArticleEditor

fixes.py

pagegenerators.py

  • class AllpagesPageGenerator:
  • class PrefixingPageGenerator:
  • class NewpagesPageGenerator:
  • class FileLinksGenerator:
  • class ReferringPageGenerator:
  • class ReferringPagesGenerator:
  • class CategorizedPageGenerator:
  • class CategoryPartPageGenerator:
  • class LinkedPageGenerator:
  • class TextfilePageGenerator:
  • class GoogleSearchPageGenerator:
  • class MySQLPageGenerator:
  • class YearPageGenerator:
  • class DayPageGenerator:
  • class NamespaceFilterPageGenerator:
  • class RedirectFilterPageGenerator:
  • class CombinedPageGenerator:
  • class CategoryGenerator:
  • class PreloadingGenerator:
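
The generators are designed to be chained; a sketch assuming the classic generator interface (the start page and namespace are arbitrary examples):

import pagegenerators

gen = pagegenerators.AllpagesPageGenerator(start=u'M')       # walk all pages from 'M'
gen = pagegenerators.NamespaceFilterPageGenerator(gen, [0])  # keep the main namespace only
gen = pagegenerators.PreloadingGenerator(gen)                # fetch page texts in bulk
for page in gen:
    print page.title()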

query.py

simplejson.__init__.py

A simple, fast, extensible JSON encoder and decoder
JSON (JavaScript Object Notation) <http://json.org> is a subset of JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data interchange format.
simplejson exposes an API familiar to users of the standard library marshal and pickle modules.
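
A round trip in that familiar style:

import simplejson

s = simplejson.dumps({'title': u'Beispiel', 'ns': 0})   # encode to a JSON string
obj = simplejson.loads(s)                               # decode back to Python objects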

simplejson.decoder.py

Implementation of JSONDecoder
  • class JSONDecoder(object)

simplejson.encoder.py

Implementation of JSONEncoder
  • class JSONEncoder(object)

simplejson.jsonfilter.py

  • class JSONFilter(object)

simplejson.scanner.py

Iterator-based SRE token scanner
  • class Scanner(object)

test.py

titletranslate.py

wiktionary\entry.py

  • class Entry

wiktionary\header.py

  • class Header(object)

wiktionary\meaning.py

  • class Meaning

wiktionary\sortonlanguagename.py

  • class sortonlanguagename

wiktionary\structs.py

Basic structures for wiktionary.py

wiktionary\term.py

  • class Term
  • class Noun(Term)
  • class Adjective(Term)
  • class Verb(Term)

wiktionary\wiktionarypage.py

This module contains code to store Wiktionary content in Python objects. The objects can output the content again in Wiktionary format by means of the wikiWrap methods.
The code is still very much alpha level, and the scope of what it can do is still rather limited: only 3 parts of speech, only 2 different Wiktionary output formats, and a langnames matrix for only about 8 languages. One of the things on the todo list is to harvest the content of this matrix dictionary from the various Wiktionary projects. GerardM has already put them all online in templates.
  • class WiktionaryPage

Test modules

testfamily.py

This utility's primary use is to find all mismatches between the namespace naming in the family files and the language files on the wiki servers.

If the -all parameter is used, it runs through all known languages in a family.

wiktionarytest.py

Unit tests for Wiktionary.py
  • class KnownValues(unittest.TestCase)
  • class SortEntriesCheckSortOrder(unittest.TestCase)
  • class TestKnownValuesInParser(unittest.TestCase)
  • class ToRomanBadInput(unittest.TestCase)

xmltest.py

This is a minimal script to parse an XML file such as the sax_parse_bug.dat dumps that interwiki.py can make if something fails. The only goal of this program is to get a stack trace listing line number and column of the invalid character in the XML.

wiktionary\headertest.py

Unit tests for header.py
  • class KnownValues(unittest.TestCase)

wiktionary\meaningtest.py

  • class KnownValues(unittest.TestCase)

wiktionary\termtest.py

Unit tests for term.py
  • class KnownValues(unittest.TestCase):

wiktionary\wiktionarypagetest.py

Unit tests for Wiktionarypage.py
  • class SortEntriesCheckSortOrder(unittest.TestCase)
  • class TestKnownValuesInParser(unittest.TestCase)

Miscellaneous

wikicomserver.py

This library allows the use of the pywikipediabot directly from COM-aware applications.
  • class Wiki
  • class WikiPage(wikipedia.Page)

wiktionary.py

This module contains code to store Wiktionary content in Python objects.
The objects can output the content again in Wiktionary format by means of the wikiWrap methods.
The code is still very much alpha level, and the scope of what it can do is still rather limited: only 3 parts of speech, only 2 different Wiktionary output formats, and a langnames matrix for only about 8 languages. One of the things on the todo list is to harvest the content of this matrix dictionary from the various Wiktionary projects. GerardM has already put them all in templates.
  • class sortonname
  • class WiktionaryPage
  • class Entry
  • class Meaning
  • class Term
  • class Noun(Term)
  • class Adjective(Term)
  • class Header

Archive

archive\are-identical.py

Simple bot to check whether two pages with the same name on different language 'pedias have interwiki links to the same page on another language.
It may raise exceptions on pages that disappeared or whatever. This is a simple framework at least for the moment.

archive\brackethttp.py

Script to correct URLs like (http://www.example.org) to [http://www.example.org example.org] so that links are generated correctly in Wikipedia.

archive\check_extern.py

This bot is used for checking external links from Wikipedia. It checks all external links in groups of 480 pages, gives the error code for each that causes problems, and counts the number of links with and without problems.
  • class URLerrorFinder(urllib.FancyURLopener)

archive\copy_table.py

Script to copy a table from one Wikipedia to another one, translating it on-the-fly.

archive\extract_names.py

Script to extract all wiki page names a certain HTML file points to.

archive\getimages.py

Script to transfer many images from one wiki to another. Your language (which can be changed with the -lang: argument) is the language to upload to. The images should be listed in a file as interwiki image links (that is, in the form [[lang:Image:name]]); they do not need to all be from the same wiki. This file can be created with extract_wikilinks.py.

archive\translator.py

This module translates a string from one language to another, using translations given in a hard-coded dictionary. Various dictionaries exist for different types of text; e.g. type 'geography' is for tables about places and regions, and 'city' is for tables about cities and villages.

archive\WdT.py

This bot consists of WdT.py and WdTXMLParser.py and imports XML files into Wikipedia.
The XML file contains an automatically generated list of the most significant words in current events, which the bot uses as article links and compares against a local list of all articles. Only articles that have not yet been written are saved to Wikipedia.

archive\WdTXMLParser.py

  • class WdTXMLParser(ContentHandler)