Original problem:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = “AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT”,
Return: [“AAAAACCCCC”, “CCCCCAAAAA”].
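Analysis: given a string, return the set of all substrings of length 10 that occur more than once.
The idea is to scan the string once, take the length-10 substring starting at each position, and count how many times each one occurs. A map from substring to count is the natural structure for this.
A Java implementation follows: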
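import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> ls = new ArrayList<String>();
        // Check for null first; calling length() on a null string would throw.
        if (s == null || s.length() == 0) return ls;
        int len = s.length();
        Map<String, Integer> map = new HashMap<String, Integer>();
        // Slide a window of length 10 over the string and count each substring.
        for (int i = 0; i <= len - 10; i++) {
            String str = s.substring(i, i + 10);
            if (map.containsKey(str)) {
                map.put(str, map.get(str) + 1); // seen before: increment its count
            } else {
                map.put(str, 1); // first occurrence: add it to the map
            }
        }
        // Collect every substring that occurred more than once.
        for (Map.Entry<String, Integer> ent : map.entrySet()) {
            if (ent.getValue() > 1) ls.add(ent.getKey());
        }
        return ls;
    }
}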
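
Since the problem only asks whether a substring occurs more than once, the exact counts are not strictly needed. The following is a minimal sketch of a variant that replaces the counting map with two sets: one for every window seen so far and one for windows seen at least twice. The class name RepeatedDnaSets, the method name findRepeated, and the constant WINDOW are hypothetical names chosen for illustration and are not part of the solution above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name, used only for this sketch.
public class RepeatedDnaSets {
    private static final int WINDOW = 10; // length of the sequences to look for

    public static List<String> findRepeated(String s) {
        // LinkedHashSet keeps repeated windows in the order they were first detected.
        Set<String> repeated = new LinkedHashSet<String>();
        if (s == null || s.length() < WINDOW) {
            return new ArrayList<String>(repeated);
        }
        Set<String> seen = new HashSet<String>();
        for (int i = 0; i + WINDOW <= s.length(); i++) {
            String window = s.substring(i, i + WINDOW);
            // Set.add returns false when the element was already present,
            // which means this window has occurred before.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<String>(repeated);
    }
}
```

Calling RepeatedDnaSets.findRepeated on the example string from the problem statement should yield the same two sequences as the map-based version. The set-based form does one lookup per window and skips the final pass over the map, at the cost of losing the occurrence counts.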