Problem

A string is simply an ordered collection of symbols selected from some alphabet and formed into a word; the length of a string is the number of symbols that it contains.

An example of a length 21 DNA string (whose alphabet contains the symbols ‘A’, ‘C’, ‘G’, and ‘T’) is “ATGCTTCAGAAAGGTCTTACG.”

Given: A DNA string s of length at most 1000 nt.

Return: Four integers (separated by spaces) counting the respective number of times that the symbols ‘A’, ‘C’, ‘G’, and ‘T’ occur in s.

Sample Dataset

AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC

Sample Output

20 12 17 21

Solution

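This problem gives a DNA sequence and asks how many times each of A, C, G, and T occurs in it. It is straightforward: read the file and count.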

Python strings have a built-in count method that does the counting for us.

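# Read the whole dataset; the "with" block closes the file automatically.
with open("DATA/rosalind_dna.txt", "r") as infile:
    dna = infile.read()
# Count each symbol; a trailing newline in the file does not affect the counts.
print(dna.count("A"), dna.count("C"), dna.count("G"), dna.count("T"))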
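
To check the approach without a data file, the same count calls can be run on the sample dataset from the problem statement. This is only a minimal sketch: the function name count_nucleotides is a hypothetical helper introduced here for illustration, not part of the original solution.

```python
def count_nucleotides(dna):
    """Return the number of A, C, G and T symbols in dna, in that order."""
    # str.count scans the string once per call, so this makes four passes.
    return (dna.count("A"), dna.count("C"), dna.count("G"), dna.count("T"))


# Sample dataset copied from the problem statement above.
sample = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC"
print(*count_nucleotides(sample))  # prints "20 12 17 21"
```

Four passes over a string of at most 1000 nt is negligible, so the simplicity of count is a reasonable trade here.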

Since this problem is so simple, we might as well write a version in C too.

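#include <stdio.h>

int main(void) {
  FILE *INFILE = fopen("DATA/rosalind_dna.txt", "r");
  if (INFILE == NULL) {
    perror("DATA/rosalind_dna.txt");
    return 1;
  }

  /* fgetc returns an int so that EOF can be told apart from any valid char. */
  int nt;
  int a_cnt = 0, c_cnt = 0, g_cnt = 0, t_cnt = 0;

  /* Read one character at a time; anything other than A, C, G, T is ignored. */
  while ((nt = fgetc(INFILE)) != EOF) {
    switch (nt) {
      case 'A':
        a_cnt++;
        break;
      case 'C':
        c_cnt++;
        break;
      case 'G':
        g_cnt++;
        break;
      case 'T':
        t_cnt++;
        break;
    }
  }
  fclose(INFILE);

  printf("%d %d %d %d\n", a_cnt, c_cnt, g_cnt, t_cnt);
  return 0;
}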
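
The C version visits each character once instead of scanning the string four times. For comparison, a single-pass count can also be sketched in Python with collections.Counter from the standard library. The function name count_nucleotides_single_pass is a hypothetical name chosen for this illustration.

```python
from collections import Counter


def count_nucleotides_single_pass(dna):
    """Tally every character of dna in one pass, then report A, C, G, T."""
    # Counter tallies all characters, including any newline; only the four
    # nucleotide symbols are read back, and a missing symbol yields 0.
    counts = Counter(dna)
    return tuple(counts[symbol] for symbol in "ACGT")


print(*count_nucleotides_single_pass("ACGTTA\n"))  # prints "2 1 1 2"
```

As in the C switch statement, characters outside the alphabet (such as the trailing newline) have no effect on the reported counts.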