spiders bug using (?<variable>.*?)


Postby dgemily on Sat Sep 16, 2006 3:12 pm

Hi Steven,
the (?<variable>.*?) capture doesn't work with the latest exes.
If a spider uses (?<variable>.*?) to reach a field, we can't get that field. The same spider still works with an old xlobby2.exe.

Here is an example of a spider that uses it:

Code:
downloadresultsonly
limit=0
url=http://www.marmiton.org/mds/recherche_ok.cfm?startr=1&criteria=%searchstring%&ContenuRecherche=0&TypePlat=dessert
results=<a href="..(?<url>/recettes/recette.cfm.*?)" class="lienblanc"><font><b>(?<display>.*?)</b></font>

//find title
<span><b>(?<title>.*?)</b></span>

//find type
<img src="../pix/recettes/recette/bullette_header.gif"><IMG SRC="../pix/recettes/recette/.*?.gif" BORDER="0" ALT="(?<type>.*?)">&nbsp;

//find rating
<NOBR><IMG SRC="../pix/recettes/recette/m-vert.gif" border="0" alt="(?<rating>.*?)"><IMG SRC

//find plot
spirales.gif"(?<variable>.*?)bas_recette.gif"
<td><span>(?<plot>.*?)</span></TD>


With this one, we can't get the "plot" field.
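
A minimal Python sketch of what the two lines under //find plot are meant to do: the first regex captures a chunk of the page into "variable", and the next regex is run against that chunk instead of the whole page. This is an assumption about the intended behaviour, not XLobby's actual code. The sample HTML and the name extract_plot are made up for illustration, and Python spells named groups (?P<name>...) where the spiders use (?<name>...).

```python
import re

# Hypothetical page fragment; the real marmiton.org markup may differ.
SAMPLE_PAGE = (
    '<td><span>Unrelated sidebar text</span></TD>'
    '<img src="spirales.gif"><table>'
    '<td><span>Melt the chocolate, then fold in the eggs.</span></TD>'
    '</table><img src="bas_recette.gif">'
)

# Step 1: capture the region between the two marker images.
VARIABLE_RE = re.compile(
    r'spirales\.gif"(?P<variable>.*?)bas_recette\.gif"', re.DOTALL
)
# Step 2: this pattern is applied to the captured region only.
PLOT_RE = re.compile(r'<td><span>(?P<plot>.*?)</span></TD>', re.DOTALL)


def extract_plot(page):
    """Return the plot found inside the 'variable' region, or None."""
    region = VARIABLE_RE.search(page)
    if region is None:
        return None
    plot = PLOT_RE.search(region.group("variable"))
    return plot.group("plot") if plot else None


print(extract_plot(SAMPLE_PAGE))
```

Run against the whole page, the plot pattern would pick up the sidebar text first, which is why the spider narrows the search with "variable" before looking for the plot.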

thanks
Member of XLobby FR Team
~~~~~~~~~~~~~~~~
Releases from the French team
dgemily
 
Posts: 795
Joined: Thu May 13, 2004 6:24 am
Location: Paris, France

Postby tomprout on Tue Oct 10, 2006 11:49 am

+1
tomprout
 
Posts: 54
Joined: Wed Mar 15, 2006 9:20 am

Postby lar282 on Tue Oct 10, 2006 11:58 am

More and more vital functions are broken in the latest exe.

Steven, please help us all out here.


//Lasse
lar282
 
Posts: 1682
Joined: Thu Apr 01, 2004 4:13 pm
Location: Helsingborg, Sweden

Postby rika on Wed Oct 11, 2006 5:31 pm

Yes, please take some time for the latest bugs.

Rika
Skins: Dragon Vintage HeavyMetalFlash Green In Work
-------------------------Have Fun!----------------------------
rika
 
Posts: 403
Joined: Fri Apr 02, 2004 5:43 am
Location: Sweden

Postby tswhite70 on Sat Oct 14, 2006 4:56 pm

+1!

tsw
tswhite70
 
Posts: 319
Joined: Tue Jan 06, 2004 3:44 pm
Location: Houston, Tx

Postby S Pittaway on Sun Oct 15, 2006 9:37 am

+2
S Pittaway
 
Posts: 713
Joined: Wed Jan 25, 2006 11:08 am
Location: Manchester, England

Postby m_ski on Mon Oct 16, 2006 10:56 am

+1
m_ski

Silverstone LC03-V, Intel E2220 2.4GHz, Scythe Mini Ninja Heatsink,
Gigabyte GA-P35-DS3L , Crucial 2Gb PC2-6400,
Sapphire HD 3470 256MB (Passive),
120Gb Samsung Spinpoint SATA, 500Gb Seagate FreeAgent USB2,
DVB-T: DNTV Live! Pro, DNTV TinyUSB2
m_ski
 
Posts: 207
Joined: Wed Dec 08, 2004 7:57 am
Location: Kent, United Kingdom

Postby slaman on Sat Oct 28, 2006 5:47 am

bump - please help us out.

I really need a spider that can grab actor information from imdb.com. The spider I have doesn't (it grabs everything else, though!).

first billed only: </b></td></tr> (?<variable><tr>.*?</tr><tr>.*?</tr><tr>.*?</tr><tr>.*?</tr><tr>.*?</tr>)
<td valign="top"><a href="/name/.*?">(?<actors>.*?)</a></td>
slaman
 
Posts: 147
Joined: Sat Oct 14, 2006 10:30 pm

Postby stevenhanna6 on Sat Nov 25, 2006 12:29 am

dgemily, does the spider you posted still work? I think the website changed its search system, so I can't look into "variable", since I don't have any spiders that use it. Do you have another one you can post?
stevenhanna6
 
Posts: 934
Joined: Tue Feb 18, 2003 10:39 am
Location: Ontario, Canada

Postby dgemily on Sat Nov 25, 2006 1:59 pm

stevenhanna6 wrote:dgemily, does the spider you posted still work? I think the website changed its search system, so I can't look into "variable", since I don't have any spiders that use it. Do you have another one you can post?


yep, this one:
Code:
url=http://us.imdb.com/find?q=%searchstring%;tt=1
results=<a href="(?<url>/title/.*?/).*?">(?<display>.*?)</a>


//title & year
<strong>(?<title>.*?)<small>\(<a href="/Sections/Years/.*?">(?<year>.*?)</a>

//store contents to variable....next regex uses that variable!!!
<b>Directed by</b>(?<variable>.*?)<b>Writing credits</b>
<a href="/name/.*?">(?<director>.*?)</a>

<a href="/Sections/Genres/.*?">(?<genre>.*?)</a>

<a href="/mpaa">MPAA</a>:</b>(?<rating>.*?)<br>

<b>Runtime:</b>(?<runtime>.*?)<br>


first billed only: </b></td></tr> (?<variable><tr>.*?</tr><tr>.*?</tr><tr>.*?</tr><tr>.*?</tr><tr>.*?</tr>)
<td><a href="/name/.*?">(?<actors>.*?)</a></td>


//find imdb rating
User Rating.*<a href="ratings">.*<b>(?<IMDB_rating>.*)</b>.*votes\)

//watch the ( ) they must have the escape character!!! ---> \
<a href="(?<url>/rg/title-tease/plot.*?/plotsummary)">\(more\)</a>
<p>(?<plot>.*?)</p>

and another example:
Code:
url=http://www.allocine.fr/recherche/?motcle=%searchstring%&f=3&rub=6
results=<td><h4><a href="(?<url>/series/ficheserie_gen_cserie=.*?)" class="link1">(?<display>.*?)</a></h4>

//find actors
<b>Acteurs principaux</b>(?<variable>.*?)>Le casting complet<
/><br>(?<actors>.*?)</a>

//find directors
<h4>S.*?par <a(?<variable>.*?)</h4>(?<director>.*?)</a>

//find genre
-&nbsp;Genre : (?<genre>.*?)</h4>

//find runtime
<div><h4>
Format : (?<runtime>.*?)&

//find rating
>Spectateurs</a> <img src="http://a69.g.akamai.net/n/69/10688/v1/img5.allocine.fr/acmedia/skin/allocinev5/icone/etoile_(?<rating>.*?).gif">Synopsis<.*?<h4>(?<plot>.*?)</h4>

//find coverart
style="padding:0 0 5 0".*?<img src="(?<coverart>http://a69.g.akamai.net/n/69/10688/v1/img5.allocine.fr/acmedia/.*?)">



With both spiders, you will not get any information for the actors and director fields.
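
Why the "variable" step matters for these fields can be sketched in Python. The name pattern matches every /name/ link on the page, so the spider first narrows the text to the part between "Directed by" and "Writing credits". This is only an illustration under assumptions: the HTML fragment and the function names_in_scope are hypothetical, it is not how XLobby is implemented, and Python writes named groups as (?P<name>...) instead of (?<name>...).

```python
import re

# Hypothetical fragment in the shape the IMDb spider expects.
SAMPLE_PAGE = (
    '<b>Directed by</b><br><a href="/name/example1/">Jane Director</a><br>'
    '<b>Writing credits</b><br><a href="/name/example2/">John Writer</a><br>'
)

# Stage 1: keep only the text between the two headings.
SCOPE_RE = re.compile(
    r'<b>Directed by</b>(?P<variable>.*?)<b>Writing credits</b>', re.DOTALL
)
# Stage 2: collect every linked name inside whatever text it is given.
NAME_RE = re.compile(r'<a href="/name/.*?">(?P<director>.*?)</a>')


def names_in_scope(page):
    """Return the names linked inside the scoped region (empty if no region)."""
    scope = SCOPE_RE.search(page)
    if scope is None:
        return []
    return NAME_RE.findall(scope.group("variable"))


scoped = names_in_scope(SAMPLE_PAGE)
unscoped = NAME_RE.findall(SAMPLE_PAGE)
print(scoped)    # ['Jane Director']
print(unscoped)  # ['Jane Director', 'John Writer']
```

Without the scoping stage, the writer ends up in the director list as well, so the second pattern alone cannot replace the pair.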

I can send you some other spiders that use (?<variable>.*?) if you need them.

Thanks, talk later.
dgemily
 
Posts: 795
Joined: Thu May 13, 2004 6:24 am
Location: Paris, France

Postby slaman on Thu Dec 28, 2006 9:53 pm

Any updates on this?

I'm desperately trying to find a spider that can get me high-res coverart (front cover only) and actor information... None out there do that at the moment.
slaman
 
Posts: 147
Joined: Sat Oct 14, 2006 10:30 pm

Postby AgentJBOND on Wed Jan 10, 2007 4:43 am

Yeah, I too am bummed that I can't get actors through IMDb anymore :(
AgentJBOND
 
Posts: 3
Joined: Sun Jun 11, 2006 3:09 am

