python - Using text(), is there a way to convert empty text to 'None' with Scrapy?


I'm running into a problem. The XML from the website I'm scraping has some empty values, and I need to preserve the order of the values.

Sample:

    <thedata>
        <some-item>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value>44</value>
            <value>32</value>
            <value>31</value>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
            <value>32</value>
            <value>31</value>
            <value>34</value>
            <value>34</value>
            <value>33</value>
        </some-item>
    </thedata>

Selecting with text() ignores the empty values:

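    # import path used by the older Scrapy releases that provide .select()
    from scrapy.contrib.spiders import XMLFeedSpider

    class MySpider(XMLFeedSpider):
        name = 'myspider'
        start_urls = ['http://www.example.com/somexml.xml']
        itertag = 'thedata'

        # using XMLFeedSpider
        def parse_node(self, response, node):
            # text() matches only text nodes, so empty <value/> elements yield nothing
            item_vals = node.select('some-item/value/text()').extract()
            print item_vals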

This prints a list that contains only the values that have an integer.

Since I need to preserve the order, is there a way to tell Scrapy to replace the empty values with '' or None?

Edit: @unutbu: I'm still getting the same problem:

    item_vals = node.select('some-item/value/text()').extract()
    print item_vals
    item_vals2 = node.select('some-item/value/text()').extract() or None
    print item_vals2

Output:

    [u'44',u'32',u'31',u'32',u'31',u'34',u'34',u'33']
    [u'44',u'32',u'31',u'32',u'31',u'34',u'34',u'33']

What I want is:

    [None,None,None,None,None,u'44',u'32',u'31',None,None,u'32',u'31',u'34',u'34',u'33']

or anything else that represents an empty value when one is encountered.

You need to select all the value nodes, and then extract the text (if any) from each piece (hxs here stands for your selector, which is node in the question):

    [txt
     for item in hxs.select('some-item/value')
     for txt in item.select('text()').extract() or [u'']]
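If you want to check the idea outside a spider, here is a minimal sketch using the standard library's xml.etree.ElementTree instead of Scrapy selectors. That is a swapped-in parser, not the Scrapy API. The function name values_with_placeholders, the placeholder parameter, the shortened sample, and the xmlns:xsi declaration are all assumptions added so the snippet runs on its own. The point is the same as above: select the value nodes themselves rather than their text, then substitute a placeholder for the ones that have no text.

```python
import xml.etree.ElementTree as ET

# Shortened version of the sample from the question. The xmlns:xsi
# declaration is an assumption added so the fragment parses on its own.
SAMPLE = """
<thedata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <some-item>
        <value xsi:nil="true"/>
        <value>44</value>
        <value>32</value>
        <value xsi:nil="true"/>
        <value>31</value>
    </some-item>
</thedata>
"""


def values_with_placeholders(xml_text, placeholder=None):
    """Return one entry per <value> node, in document order.

    Nodes without text get `placeholder` instead of being skipped.
    (Hypothetical helper, named here for illustration only.)
    """
    root = ET.fromstring(xml_text.strip())
    # Select the nodes, not their text, so empty nodes stay in the list.
    nodes = root.findall('some-item/value')
    return [node.text if node.text else placeholder for node in nodes]


item_vals = values_with_placeholders(SAMPLE)
print(item_vals)
```

Passing placeholder='' instead gives empty strings in those positions, which matches the u'' default used in the list comprehension above.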
