python - Using text(), is there a way to convert empty text to 'None' with scrapy -

I'm running into a problem. The XML I'm scraping from a website has some empty values, and I need to preserve the order of the values.

Sample:

    <thedata>
        <some-item>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value>44</value>
            <value>32</value>
            <value>31</value>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value>32</value>
            <value>31</value>
            <value>34</value>
            <value>34</value>
            <value>33</value>
        </some-item>
    </thedata>

Selecting with text() ignores the empty values:

    class MySpider(XMLFeedSpider):
        name = 'myspider'
        start_urls = ['http://www.example.com/somexml.xml']
        itertag = 'thedata'

        # using XMLFeedSpider
        def parse_node(self, response, node):
            # text() matches only text nodes, so empty <value/> elements yield nothing
            item_vals = node.select('some-item/value/text()').extract()
            print item_vals

This prints a list that contains only the values that have an integer.

Since I need to preserve the order, is there a way to tell Scrapy to replace the empty values with '' or None?

Edit: @unutbu: I'm still getting the same problem:

    item_vals = node.select('some-item/value/text()').extract()
    print item_vals
    item_vals2 = node.select('some-item/value/text()').extract() or None
    print item_vals2

Output:

    [u'44',u'32',u'31',u'32',u'31',u'34',u'34',u'33']
    [u'44',u'32',u'31',u'32',u'31',u'34',u'34',u'33']

What I want is:

    [None,None,None,None,None,u'44',u'32',u'31',None,None,u'32',u'31',u'34',u'34',u'33']

or anything else that represents an empty value when one is encountered.

You need to select all the value nodes, and then extract the text (if any) from each piece:

    [txt for item in hxs.select('some-item/value') for txt in item.select('text()').extract() or [u'']]
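The same idea can be tried outside Scrapy. The following is only a minimal sketch: it uses the standard library's `xml.etree.ElementTree` as a stand-in for Scrapy selectors, the helper name `values_in_order` is hypothetical, and the `xmlns:xsi` declaration on the root is an assumption added so that the shortened sample parses on its own. The point it illustrates is that iterating over the `value` elements, rather than over their text nodes, keeps one slot per element, so empty ones show up as `None`.

```python
import xml.etree.ElementTree as ET

# Shortened version of the sample feed. Assumption: the xsi prefix is declared
# on the root here so the snippet is parseable by itself.
SAMPLE = """
<thedata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <some-item>
    <value xsi:nil="true"/>
    <value>44</value>
    <value xsi:nil="true"/>
    <value>32</value>
  </some-item>
</thedata>
"""


def values_in_order(xml_text):
    """Return the text of every <value>, using None for empty elements."""
    root = ET.fromstring(xml_text.strip())
    # Walk the elements themselves; an element without text has .text == None,
    # so it still occupies its position in the result.
    return [value.text for value in root.findall('some-item/value')]


item_vals = values_in_order(SAMPLE)
print(item_vals)
```

To get `''` instead of `None`, the expression in the list can be changed to `value.text or ''`, which mirrors the `or [u'']` fallback in the comprehension above.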
