{"id":269,"date":"2013-04-06T23:40:31","date_gmt":"2013-04-07T06:40:31","guid":{"rendered":"http:\/\/blog.ls-al.com\/?p=269"},"modified":"2013-04-06T23:42:49","modified_gmt":"2013-04-07T06:42:49","slug":"python-manipulating-xml","status":"publish","type":"post","link":"https:\/\/blog.ls-al.com\/python-manipulating-xml\/","title":{"rendered":"Python Manipulating XML"},"content":{"rendered":"

A short script with descriptions for reading and manipulating XML. \u00a0It seems like the Python ElementTree module should be the easiest and best suited for XML manipulation. \u00a0However, I had a complicated XML structure with multiple namespaces, and lxml handled it better. \u00a0ElementTree could only handle one namespace with its register function. In addition, lxml has pretty_print, which might be useful. \u00a0In my case, though, pretty_print did not work after inserts, even with the FAQ fix using remove_blank_text.<\/p>\n
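Before the full script, here is a minimal sketch of the two namespace idioms it relies on: looking up elements in a namespaced document, and setting a namespaced attribute such as xsi:nil. The sample XML, the prefix d, and the names WLS_NS, XSI_NS and SAMPLE are made up for illustration. Creating the new children inside the WebLogic namespace is an assumption about what the output should look like, not something the script below does. The stdlib fallback is only there so the sketch runs without lxml installed.

```python
try:
    import lxml.etree as ET  # the library used in the script below
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback; same calls used here

# The WebLogic URI comes from the script; XSI is the standard
# XML Schema instance namespace.
WLS_NS = "http://xmlns.oracle.com/weblogic/domain"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical, heavily trimmed stand-in for config.xml
SAMPLE = (
    '<domain xmlns="%s" xmlns:xsi="%s">'
    '<server><name>AdminServer</name></server>'
    '<server><name>managed1</name></server>'
    '</domain>' % (WLS_NS, XSI_NS)
)

root = ET.fromstring(SAMPLE)

# A prefix map lets the path use a short prefix instead of repeating the URI.
ns = {"d": WLS_NS}
servers = root.findall(".//d:server", namespaces=ns)

for server in servers:
    # New children are created with Clark notation: '{uri}localname'.
    ssl = ET.SubElement(server, "{%s}ssl" % WLS_NS)
    verifier = ET.SubElement(ssl, "{%s}hostname-verifier" % WLS_NS)
    # A namespaced attribute is also written as '{uri}localname',
    # not as the prefixed string 'xsi:nil'.
    verifier.set("{%s}nil" % XSI_NS, "true")

names = [s.find("d:name", namespaces=ns).text for s in servers]
print(names)
print(ET.tostring(root).decode())
```

The same prefix map can hold several namespaces at once, which is what makes this approach workable for documents that mix them.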

\r\nimport lxml.etree as ET\r\n\r\nf = open('config.xml','rb')\r\n## http:\/\/lxml.de\/FAQ.html#why-doesn-t-the-pretty-print-option-reformat-my-xml-output\r\n#parser = ET.XMLParser(remove_blank_text=True)\r\n#tree = ET.parse(f, parser)\r\ntree = ET.parse(f)\r\n\r\n#for element in tree.iter():\r\n#    element.tail = None\r\n\r\nroot = tree.getroot()\r\nnamespace="http:\/\/xmlns.oracle.com\/weblogic\/domain"\r\n## Standard XML Schema instance namespace, needed for the xsi:nil attribute\r\nxsi="http:\/\/www.w3.org\/2001\/XMLSchema-instance"\r\nservers = tree.findall('.\/\/{%s}server' % namespace)\r\n\r\n## Loop through the nodes we found\r\nfor server in servers:\r\n  print("New SERVER node detected:")\r\n  ## Iterate over a copy: removing children while looping over server itself skips siblings\r\n  for child in list(server):\r\n    tag = child.tag\r\n    val = child.text\r\n    ## Remove any existing children\r\n    if tag == "{http:\/\/xmlns.oracle.com\/weblogic\/domain}ssl":\r\n      print("found server.ssl and will remove", end=" ")\r\n      server.remove(child)\r\n    if tag == "{http:\/\/xmlns.oracle.com\/weblogic\/domain}log":\r\n      print("found server.log and will remove", end=" ")\r\n      server.remove(child)\r\n    if tag == "{http:\/\/xmlns.oracle.com\/weblogic\/domain}data-source":\r\n      print("found server.data-source and will remove", end=" ")\r\n      server.remove(child)\r\n    print(tag, val)\r\n  \r\n  ## Add the 3 children we want \r\n  child = ET.Element("ssl")\r\n  child.text=''\r\n  server.insert(1,child)\r\n  ##  xsi:nil has to be passed as a namespace-qualified name ({uri}nil);\r\n  ##  a plain 'xsi:nil' string is not a valid attribute name for lxml.\r\n  gchild = ET.Element("hostname-verifier",attrib={'{%s}nil' % xsi:'true'},nsmap={'xsi':xsi})\r\n  gchild.text=''\r\n  child.insert(1,gchild)\r\n  gchild = ET.Element("hostname-verification-ignored")\r\n  gchild.text='true'\r\n  child.insert(2,gchild)\r\n  gchild = ET.Element("client-certificate-enforced")\r\n  gchild.text='true'\r\n  child.insert(3,gchild)\r\n  gchild = ET.Element("two-way-ssl-enabled")\r\n  gchild.text='false'\r\n  child.insert(3,gchild)\r\n  \r\n  child = ET.Element("log")\r\n  child.text=''\r\n  server.insert(2,child)\r\n  gchild = ET.Element("rotation-type")\r\n  gchild.text='byTime'\r\n  child.insert(1,gchild)\r\n  gchild = ET.Element("number-of-files-limited")\r\n  gchild.text='true'\r\n  child.insert(2,gchild)\r\n  gchild = ET.Element("rotate-log-on-startup")\r\n  gchild.text='true'\r\n  child.insert(3,gchild)\r\n  \r\n  child = ET.Element("data-source")\r\n  child.text=''\r\n  server.insert(3,child)\r\n  gchild = ET.Element("data-source-log-file")\r\n  gchild.text=''\r\n  child.insert(1,gchild)\r\n  ggchild = ET.Element("rotation-type")\r\n  ggchild.text='byTime'\r\n  gchild.insert(1,ggchild)\r\n  ggchild = ET.Element("number-of-files-limited")\r\n  ggchild.text='true'\r\n  gchild.insert(2,ggchild)\r\n  ggchild = ET.Element("rotate-log-on-startup")\r\n  ggchild.text='true'\r\n  gchild.insert(3,ggchild)\r\n\r\n## Check out why pretty_print is not making newlines in new tags  \r\n#print(ET.tostring(tree, pretty_print=True))\r\ntree.write("wc-out.xml", pretty_print=True)\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"

A short script with descriptions for reading and manipulating xml. \u00a0It seems like the python ElementTree module should be the<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13,28],"tags":[],"class_list":["post-269","post","type-post","status-publish","format-standard","hentry","category-python","category-xml"],"_links":{"self":[{"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/posts\/269","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/comments?post=269"}],"version-history":[{"count":0,"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/posts\/269\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/media?parent=269"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/categories?post=269"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.ls-al.com\/wp-json\/wp\/v2\/tags?post=269"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}