
User talk:ClueBot Commons: Difference between revisions


Revision as of 08:45, 8 March 2016

The current status of ClueBot NG is: Running
The current status of ClueBot III is: Running
Praise should go on the praise page. Barnstars and other awards should go on the awards page.
Use the "new section" button at the top of this page to add a new section. Use the [edit] link above each section to edit that section.
This page is automatically archived by ClueBot III.
The ClueBots' owner or someone else who knows the answer to your question will reply on this page.


ClueBots
ClueBot NG/Anti-vandalism · ClueBot II/ClueBot Script
ClueBot III/Archive · Talk page for all ClueBots
Beware! This user's talk page is monitored by talk page watchers. Some of them even talk back.

Redirect catching

Is it possible to create a filter to catch this? I noticed it creates false positives on the new pages feed: I have caught two tonight, one for "fisting" and the other for "KKK". Kevin Rutherford (talk) 02:25, 4 March 2016 (UTC)

(talk page stalker) ClueBot doesn't operate on filters, but rather a variety of techniques described in detail at User:ClueBot_NG#Vandalism_Detection_Algorithm. If such edits are frequent, you could try raising the matter at WT:EF to discuss a possible edit filter which could stop such edits before they go through. jcgoble3 (talk) 05:59, 4 March 2016 (UTC)

Subroutine generatedetailedindex should be rewritten

proposal

 /**
 * @generatedetailedindex ; called from @generateindex ;  @generatedetailedindex_begin ; @generatedetailedindex_begin ; @generatedetailedindex_begin 
 * Generate the detailed index of an archive page.
 * Always read the archive; always read its index; if required, rebuild the index and write it.
 *
 * @todo: compare the timestamps of the archive and its index
 *
 * @param $apage. The name of the page.
 * @param $level. The level of subsections. Most often is 2.
 * @param $adata. The contents of the page if already known.
 * @param $ret.   Write the page or not. Default to false... meaning write !
 * @return 0 if no sections found ; 1 if chksum ok ; 2 if index written
 **/

function mygeneratedetailedindex ($apage,$level,$adata=null,$ret=false) {

 global $user;
 global $wpq;
 global $wpi;
 global $wpapi;
 global $msgbot;
 global $isbot;
 global $ischeckrun;
 $i = 1;
 $version = '1.1';
 if ($adata === null) $adata = $wpq->getpage($apage);
 $month = array(
   'January' => 1, 'February' => 2, 'March' => 3, 'April' => 4, 'May' => 5, 'June' => 6, 
   'July' => 7, 'August' => 8, 'September' => 9, 'October' => 10, 'November' => 11, 'December' => 12,
   'Jan' => 1, 'Feb' => 2, 'Mar' => 3, 'Apr' => 4, 'Jun' => 6, 
   'Jul' => 7, 'Aug' => 8, 'Sep' => 9, 'Oct' => 10, 'Nov' => 11, 'Dec' => 12
   );
   

//------------------------- fixing bad dates

 $pattern = '/(\d{2}):(\d{2}), ([a-zA-Z]+) (\d+), (\d{4}) \(UTC\)/i';
 $replacement = '*** ${1}:${2}, ${4} ${3} ${5} (UTC)';
 if (preg_match($pattern, $adata)==1) {
   echo "\n--- bad dates found ---\n";
   $adata = preg_replace($pattern, $replacement, $adata);
   if (!$ret) {
     var_dump($wpapi->edit($apage, $adata, 'fixing bad dates.'.$msgbot, false, $isbot, null, null, $ischeckrun));
     echo "\n";
   } // action
   // funct. edit parameters (Setting one of the Detailed Indices
   //$page         = $title
   //$data         = $data  (top and bottom are no-included)
   //$summary      = 'Updating detailed index'
   //$minor        = false
   //$bot          = $isbot
   //$wpStarttime  = null
   //$wpEdittime   = null
   //$checkrun     = $ischeckrun
   
 } // datalakon
 

//------------------------- any sections ?

 $milor='/\\n'.str_repeat('=',$level).'[^=]/';
 // echo "---".$milor."---\n";
 if (preg_match($milor, $adata) == 0) {
   echo '--- no sections found at '.$apage.' ---'."\n";
   return 0; // no sections found
 }

//------------------------- checksum

 $checksum = md5(md5($version).md5($adata));
 $cdata = $wpq->getpage('User:'.$user.'/Detailed Indices/'.$apage);
 if (preg_match('/\<\!-- CB3 MD5:([0-9a-f]{32}) --\>/i',$cdata,$m)) {   //////////// test is here   !!!!!!!!!!!!!!!!!!!!!!!!!!!!!
         if (trim(strtolower($m[1])) == trim(strtolower($checksum))) {
                 return 1; // checksum
         }
 } 
 

//------------------------- main

 $sects = splitintosections($adata,$level);
 $data = '';
 unset($sects[0]);
 // $header = "\n".'{|class="wikitable sortable"'."\n".'! Order !! Header !! Start Date !! End Date !! Comments !! Size !! Archive'."\n";
 $header = "\n".'{|class="wikitable sortable"'."\n".'! # !! Header !!Start Date!!End Date!! Nb!! Size !! Archive'."\n";
 
 foreach ($sects as $sect) {
 
   $data .= '|-'."\n".'| '.$i.' || '.trim($sect['header']).' || ';
   if (preg_match_all('/(\d{2}):(\d{2}), (\d+) ([a-zA-Z]+) (\d{4}) \(UTC\)/i',$sect['content'],$dates,PREG_SET_ORDER)) {
     $times = array();
     $month = array(
       'January' => 1, 'February' => 2, 'March' => 3, 'April' => 4, 'May' => 5, 'June' => 6, 
       'July' => 7, 'August' => 8, 'September' => 9, 'October' => 10, 'November' => 11, 'December' => 12,
       'Jan' => 1, 'Feb' => 2, 'Mar' => 3, 'Apr' => 4, 'Jun' => 6, 
       'Jul' => 7, 'Aug' => 8, 'Sep' => 9, 'Oct' => 10, 'Nov' => 11, 'Dec' => 12
       );
     foreach ($dates as $date) $times[] = gmmktime($date[1],$date[2],0,$month[$date[4]],$date[3],$date[5]);
     sort($times,SORT_NUMERIC);
     // MODIF $data .= gmdate('Y-m-d H:i',$times[0]).' || '.gmdate('Y-m-d H:i',$times[count($times)-1]).' || '.count($times);
     // no spaces in cells to allow sorting by dates; no spaces in date to force single line at rendering
     $data .= gmdate('Y-m-d_H:i',$times[0]).'||'.gmdate('Y-m-d_H:i',$times[count($times)-1]).'||'.count($times);
     
   } else {
     $data .= 'Unknown || Unknown || Unknown';
   }
   $data .= ' || '.strlen($sect['content']).' || [['.$apage.'#'.str_replace(array('','',"","",'Template:','','|'),'',trim($sect['header'])).'|'.$apage.']]'."\n";
   $i++;
 }
 $footer = '|}';
 if (!$ret) {
   $wpapi->edit('User:'.$user.'/Detailed Indices/'.$apage,
                $header.$data.$footer,
                'Updating detailed index for '.$apage.'.'.$msgbot,
                false,$isbot,null,null,$ischeckrun);
   echo "--=--\n";
 }
 // funct. edit parameters (Setting one of the Detailed Indices
 //$page         = $title
 //$data         = $data  (top and bottom are no-included)
 //$summary      = 'Updating detailed index'
 //$minor        = false
 //$bot          = $isbot
 //$wpStarttime  = null
 //$wpEdittime   = null
 //$checkrun     = $ischeckrun
 
 
 return 2; // data written

}

/**
  * @generatedetailedindex__end ; @generatedetailedindex__end ; @generatedetailedindex__end ; @generatedetailedindex__end ; @generatedetailedindex__end ;  
  *
  **/

rationales

When seeing [[1]], one can conclude that the following modifications would be useful.

  1. English dates in archives should be recognized. $pattern = '/(\d{2}):(\d{2}), ([a-zA-Z]+) (\d+), (\d{4}) \(UTC\)/i' should be detected (at least once in a whole archive). When found, it should be replaced by $replacement = '*** ${1}:${2}, ${4} ${3} ${5} (UTC)' and the whole page rewritten.
  2. When the archive page contains no sections, the index page should not be created, and should not be referenced in the Master Detailed Indices.
  3. Date formatting in the generated indices. We shouldn't have extra spaces around the dates in the cells: this is required to allow sorting by date (when all dates in the column are known). Moreover, we shouldn't have inner spaces within the dates: this is required to avoid line splitting at rendering.
  4. Short months are not recognized. 'January' is the standard, but 'Jan' should be recognized also. Moreover, table $month should be a global variable, instead of being recreated at each section (circa 10,000 times for the Administrator's Noticeboard).
  5. Detecting whether the index page should be rewritten. I don't understand why a '/\<\!-- CB3 MD5:([0-9a-f]{32}) --\>/i' pattern is used, since it requires a full read of the index page. The test should be: proceed only if the index page is missing or older than the archive page.
  6. Returning $data is a fossilized remnant of the old design (the Master Detailed Indices page is built by transclusion, not by writing it out in full). At present, this return value only wastes space. Instead, we should use the return code to signify 2: index was written; 1: writing was skipped (index newer than archive); 0: empty archive, don't list it.
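
Points 1, 3 and 4 above can be sketched in a few lines of PHP. This is only an illustration: the function names `normalize_signature_dates` and `sortable_dates` are hypothetical and not part of the ClueBot III source, and the sketch omits the `*** ` marker that the proposed `$replacement` adds.

```php
<?php
// Hypothetical helpers for illustration; not taken from the ClueBot III source.

/** Rewrite "HH:MM, Month D, YYYY (UTC)" as "HH:MM, D Month YYYY (UTC)". */
function normalize_signature_dates(string $text): string {
    $pattern = '/(\d{2}):(\d{2}), ([a-zA-Z]+) (\d+), (\d{4}) \(UTC\)/';
    return preg_replace($pattern, '${1}:${2}, ${4} ${3} ${5} (UTC)', $text);
}

/** Return every signature date found in $text as "Y-m-d_H:i", sorted ascending. */
function sortable_dates(string $text): array {
    // Built once per call here; the proposal suggests making this table global.
    $months = [
        'January' => 1, 'February' => 2, 'March' => 3, 'April' => 4,
        'May' => 5, 'June' => 6, 'July' => 7, 'August' => 8,
        'September' => 9, 'October' => 10, 'November' => 11, 'December' => 12,
        'Jan' => 1, 'Feb' => 2, 'Mar' => 3, 'Apr' => 4, 'Jun' => 6,
        'Jul' => 7, 'Aug' => 8, 'Sep' => 9, 'Oct' => 10, 'Nov' => 11, 'Dec' => 12,
    ];
    $pattern = '/(\d{2}):(\d{2}), (\d+) ([a-zA-Z]+) (\d{4}) \(UTC\)/';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    $times = [];
    foreach ($matches as $m) {
        if (!isset($months[$m[4]])) {
            continue; // unknown month name: ignore this match
        }
        $times[] = gmmktime((int)$m[1], (int)$m[2], 0, $months[$m[4]], (int)$m[3], (int)$m[5]);
    }
    sort($times, SORT_NUMERIC);
    // No space inside the date, so a table cell sorts and renders on one line.
    return array_map(fn($t) => gmdate('Y-m-d_H:i', $t), $times);
}

$sample = "Reply one. 08:45, March 8, 2016 (UTC)\nReply two. 02:25, 4 Mar 2016 (UTC)";
$fixed = normalize_signature_dates($sample);
print_r(sortable_dates($fixed));
```

The first and last elements of the returned array would fill the Start Date and End Date cells of a section row.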


Pldx1 (talk) 10:52, 4 March 2016 (UTC)
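
The test suggested in point 5, combined with the return codes from point 6, might look like the sketch below. It is a minimal illustration under stated assumptions: `index_action` and the three constants are hypothetical names, and how the two last-modified timestamps are fetched from the wiki is left to the caller.

```php
<?php
// Hypothetical sketch; the names below do not exist in the ClueBot III source.
const INDEX_EMPTY   = 0; // archive has no sections: do not list it
const INDEX_SKIPPED = 1; // index is at least as recent as the archive
const INDEX_WRITTEN = 2; // index is missing or older: (re)write it

/**
 * Decide what to do with the index of one archive page.
 *
 * @param bool     $hasSections Whether the archive contains any section.
 * @param int      $archiveTime Unix time of the archive's last edit.
 * @param int|null $indexTime   Unix time of the index's last edit, or null if missing.
 */
function index_action(bool $hasSections, int $archiveTime, ?int $indexTime): int {
    if (!$hasSections) {
        return INDEX_EMPTY;
    }
    if ($indexTime !== null && $indexTime >= $archiveTime) {
        return INDEX_SKIPPED;
    }
    return INDEX_WRITTEN;
}

// Example with ISO 8601 timestamps; strtotime() converts them to Unix time.
$archiveTime = strtotime('2016-03-04T10:52:00Z');
$indexTime   = strtotime('2016-03-01T00:00:00Z');
var_dump(index_action(true, $archiveTime, $indexTime)); // index older than archive
```

Compared with the MD5 check, this decision needs only the two timestamps and never reads the body of the index page.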

Subroutine generateindex should be rewritten

rationales

When used over a page with many archives, the resulting MasterDetailedIndex overflows. Here are some ideas for fixing this problem:

  1. When there are more than a given number of archives (maybe 50, maybe 100), the process shouldn't be called at all from the template {{User:ClueBot III/ArchiveThis}}, i.e. we should have nogenerateindex=yes.
  2. When there are more than 100 archives, the MasterDetailedIndex must be split into several parts, each transcluding fewer than 100 indexes. The MDI should then be a page that links to the MDI parts. For example, there are 915 Wikipedia:Administrators'_noticeboard/IncidentArchive... pages, requiring 10 MDI parts.
  3. This routine should use the return code given by generatedetailedindex so as not to list archive files that don't contain any section (i.e. aren't genuine archives).
  4. If we change the header in the DI (cf. the subroutine), the header of the MDI should be changed accordingly.
  5. Sorting the archives by their last modification date is unreliable when one of these archives has been modified for any reason (e.g. some random gnome decides that one of his hobbyhorses requires a change). Maybe we can reuse a page like Wikipedia:Administrators'_noticeboard/IncidentArchives to force the ordering (of most of the pages).
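
The split described in point 2 can be sketched with `array_chunk`. This is an illustration only: `split_into_parts` is a hypothetical name, the archive titles are placeholders, and the limit of 99 per part is an assumption chosen to stay under 100 transclusions.

```php
<?php
// Hypothetical sketch; function name, page titles and limit are assumptions.

/** Split a list of archive titles into parts of at most $maxPerPart entries. */
function split_into_parts(array $archives, int $maxPerPart = 99): array {
    if ($maxPerPart < 1) {
        throw new InvalidArgumentException('maxPerPart must be at least 1');
    }
    return array_chunk($archives, $maxPerPart);
}

// Placeholder titles standing in for 915 archive pages.
$archives = [];
for ($n = 1; $n <= 915; $n++) {
    $archives[] = 'Example page/Archive ' . $n;
}

$parts = split_into_parts($archives);
foreach ($parts as $k => $part) {
    // Each part would become one MDI part page transcluding these indexes.
    echo 'Part ' . ($k + 1) . ': ' . count($part) . " archives\n";
}
```

The top-level MDI would then only link to the part pages instead of transcluding every index itself.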
Pldx1 (talk) 13:22, 4 March 2016 (UTC)

Multiple copies

If you gave out the source code, would it be possible for someone to edit the code to moderate a different list of pages? — Preceding unsigned comment added by Bell MT (talk • contribs) 16:12, 4 March 2016 (UTC)

The source for ClueBot NG is open, see User:ClueBot_NG#Source_Code. —k6ka 🍁 (Talk · Contributions) 22:09, 4 March 2016 (UTC)

Archiving trouble

The page WP:FFU has this configuration template:

{{User:ClueBot III/ArchiveThis
 |archiveprefix=Wikipedia:Files for upload/
 |format=F Y
 |age=4800
 |archivenow=<nowiki>{{User:ClueBot III/ArchiveNow}}</nowiki>
 |header={{Aan}}
 |headerlevel=2
 |maxarchsize=250000
}}

The idea is that sections are to be archived only if they are closed, and closed sections always contain {{User:ClueBot III/ArchiveNow}}, while unclosed sections do not contain this template. For this reason, the age parameter is set to a high value (a bit over half a year). Unfortunately, there are lots of closed sections which haven't been archived for a long time, and I can't figure out why. Is anyone able to help? --Stefan2 (talk) 18:04, 5 March 2016 (UTC)

Speedy delete vandalism

Hi,

I was wondering why ClueBot NG doesn't check new pages for vandalism and tag such pages for G3? It only seems to check edits to existing articles; or at least I've never seen ClueBot NG tag a newly created page consisting entirely of vandalism. Adam9007 (talk) 17:10, 6 March 2016 (UTC)

What made you think that I vandalised the chupacabra page? Talk this out with Acroterion, but you will not stop me inserting more sightings. Have a look on the web and in "Tracking the Chupacabra" for more information on that particular sighting. Please stop thinking of me as a vandal. — Preceding unsigned comment added by Gigantopithecusman (talk • contribs) 08:37, 8 March 2016 (UTC)

Why do you think that I vandalised it, you robot? I will put it back, and you please talk with Acroterion. Gigantopithecusman (talk) 08:45, 8 March 2016 (UTC)