2017-10-06 14 views
2

I parsed this data from Wikipedia and am trying to extract only the names from it, but each result comes with \n* in front of the data. How can I split the parsed String data without the special characters?

"": "===猫の種類===\n*[[サイアム猫]]\n*[[ペルシャネブシェヒルスカヤ]]\n*[[ペルシャ]]\n*[[ノルウェーのジオンフォレスト]]\n*[[トルコの時アンゴラ]]\n*[[アメリカンショートヘア]]\n*[[ブリティッシュショートヘア]]\n*[[ロシアンブルー]]\n*[[ベンガル]]\n*[[メインクーン]]\n*[[レクドル]]\n*[[ヒマラヤン]]\n*[[ジャパニーズ ご飯テール]]\n*[[オリエンタルショートヘア]]\n*[[ピーターボールド]]\n*[[スコティッシュフォールド]]\n*スコティッシュストレート\n*[[ハイランドフォールド]]\n*[[シベリアフォレスト]]\n*[[トルコの時半]]\n*[[コリアンショートヘア]]\n*[[オールブラック]]\n*[[社ナケト]]\n*[[クナ]]\n*[[アビシニアン]]\n*マンチキン"

This is my code:

try {
    URL url = new URL("https://ko.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&rvsection=20&titles=%EA%B3%A0%EC%96%91%EC%9D%B4&format=json");
    URLConnection con = url.openConnection();
    InputStream is = con.getInputStream();
    InputStreamReader isr = new InputStreamReader(is);
    BufferedReader reader = new BufferedReader(isr);

    while (true) {
        String data = reader.readLine();
        if (data == null) break;
        result += data;
    }
    JSONObject obj = new JSONObject(result);
    JSONObject query = (JSONObject) obj.get("query");
    JSONObject pages = (JSONObject) query.get("pages");
    JSONObject pageid = (JSONObject) pages.get("93349");
    JSONArray revisions = (JSONArray) pageid.get("revisions");
    String catcat = String.valueOf(revisions);
    String star = "\n*";
    catcat = catcat.replaceAll("\\[\\[", "").replaceAll("\\]\\]", ",").replaceAll("\\r|\\n", "").replaceAll(star, "");
    String[] catcategory = catcat.split(",");

    for (int i = 0; i < catcategory.length; i++) {
        list.add(catcategory[i]);
    }
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
} catch (JSONException e) {
    e.printStackTrace();
}

The result looks like this:

\n*サイアム猫
\n*ペルシャ

I want to remove the \n* prefix.

+1

Use 'text.replaceAll("\n*", "");' –

+1

Add one more \ in 'String star = "\\n\*";' –

+0

I've tried it. It does not work. – coooldoggy

Answers

0

How to split parsed String data without special characters?

Try this piece of code. It removes the \n*, and then you can add _result_word to your list.

for (int i = 0; i < catcategory.length; i++) {
    try {
        // "\\\\n" in Java source is the regex \\n, which matches a literal backslash followed by 'n'.
        // replace("*", "") then removes the asterisk as plain text (no regex involved).
        String _result_word = catcategory[i].replaceFirst("\\\\n", "").replace("*", "");
        //String _result_word=catcategory[i].replaceFirst("\\\\n", "").replace("*", "").replaceFirst("\\\\n", "").replace("*", "");
        System.out.println(_result_word);
        list.add(_result_word);
    } catch (Exception ex) {
        System.out.println("Special Exception occurred at index : i = " + i);
        ex.printStackTrace();
    }
}
+0

Also, you can use: String _result_word = catcategory[i].replaceFirst("\\\\n", "").replace("*", "").replaceFirst("\\\\n", "").replace("*", ""); –

+0

Thanks! This worked! If you can, could you explain how this worked? – coooldoggy
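On why this works, a likely explanation is that String.valueOf(revisions) turns the JSONArray back into JSON text, so each line break appears as the two characters backslash and n rather than as a real newline. A regex that matches those two characters needs an escaped backslash, and the Java string literal escapes each backslash once more, which gives "\\\\n". The sketch below illustrates the two layers; the class name EscapeLayers, the method stripMarker, and the sample entry are hypothetical and not taken from the original post.

```java
public class EscapeLayers {
    // Removes the first literal backslash-n and every asterisk from one entry.
    static String stripMarker(String entry) {
        // Java source "\\\\n" -> regex \\n -> matches the two characters '\' and 'n'.
        return entry.replaceFirst("\\\\n", "").replace("*", "");
    }

    public static void main(String[] args) {
        // Hypothetical entry: in Java source, "\\n" is a backslash followed by 'n',
        // not a newline character.
        String entry = "\\n*Persian";

        // The regex \n (Java source "\\n") only matches a real newline character,
        // so it leaves this entry unchanged.
        System.out.println(entry.replaceFirst("\\n", ""));

        // The doubly escaped pattern removes the literal backslash-n.
        System.out.println(stripMarker(entry));
    }
}
```

String.replace takes plain text rather than a regex, which is why the asterisk needs no escaping in that call.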

0

Everything is correct except for one line, where you need to escape both the asterisk and the backslash:

String star = "\\\\n\\*";
str = str.replaceAll(star, ""); // replaceAll returns a new String, so the result has to be assigned
+0

@Sanoop No, this is the correct answer. The topic starter's code works correctly with my fix. – Romadro
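To compare the patterns discussed in this thread, here is a minimal sketch. It assumes the text contains the literal two-character sequence backslash-n followed by an asterisk, as in the question's output. The class name StarPattern, its method names, and the sample string are hypothetical. The pattern "\n*" from the question is read by the regex engine as "zero or more newline characters", so it never touches the literal sequence. The escaped pattern, or a plain-text String.replace, removes it.

```java
public class StarPattern {
    // Pattern from the question: a real newline followed by the * quantifier.
    static String viaOriginal(String s) {
        return s.replaceAll("\n*", "");
    }

    // Escaped regex: literal backslash, 'n', literal asterisk.
    static String viaRegex(String s) {
        return s.replaceAll("\\\\n\\*", "");
    }

    // Plain-text alternative: String.replace does not interpret its argument as a regex.
    static String viaLiteral(String s) {
        return s.replace("\\n*", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample, shaped like the text after [[ and ]] have been replaced.
        String sample = "\\n*Siamese,\\n*Persian,";

        System.out.println(viaOriginal(sample)); // unchanged
        System.out.println(viaRegex(sample));    // Siamese,Persian,
        System.out.println(viaLiteral(sample));  // Siamese,Persian,
    }
}
```

The plain-text variant avoids the double escaping altogether, which may be easier to read when no regex features are needed.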