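R Basics (3)


In R, a string is really an element of a character vector.

Creating and printing strings

1. Character vectors can be created with the c function. Prefer double quotes.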

c(
  "you may not believe",
  "I often imagined how divine and adorable you can be"
  )

Out:

[1] "you may not believe"                                
[2] "I often imagined how divine and adorable you can be"

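2. The paste function combines strings. The sep argument changes the separator placed between the pasted arguments, and the collapse argument joins all elements of the result into a single string, using the given text as the joiner.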

paste(c("author","作者"),"朱生豪")
paste(c("author","作者"),"朱生豪",sep=":")
paste(c("author","作者"),"朱生豪",collapse="and")

Out:

[1] "author 朱生豪" "作者 朱生豪"  
[1] "author:朱生豪" "作者:朱生豪" 
[1] "author 朱生豪and作者 朱生豪"
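
To make the difference between sep and collapse concrete, here is a minimal sketch. The vector words and the separators are made-up example values, not part of the text above.

```r
# Hypothetical example values, not from the original text
words <- c("a", "b", "c")

# sep goes between the arguments that are pasted together element-wise
with_sep <- paste(words, 1:3, sep = "-")
print(with_sep)   # "a-1" "b-2" "c-3"

# collapse then joins the resulting elements into one string
joined <- paste(words, 1:3, sep = "-", collapse = " + ")
print(joined)     # "a-1 + b-2 + c-3"

# paste0 is shorthand for paste(..., sep = "")
no_sep <- paste0(words, 1:3)
print(no_sep)     # "a1" "b2" "c3"
```

Without collapse the result has one element per input element; with collapse it always has length one.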

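3. The toString function is handy for printing a vector: it joins the elements into a single comma-separated string. The width argument limits the number of characters in the output.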

x <- (1:5)^2
toString(x)
toString(x,width=12)

Out:

[1] "1, 4, 9, 16, 25"
[1] "1, 4, 9,...."
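
As a rough illustration of what toString does, the sketch below compares it with paste and uses it inside a message. The variable names same and msg are only illustrative.

```r
x <- (1:5)^2

# For a plain vector, toString(x) matches collapsing with ", "
same <- identical(toString(x), paste(x, collapse = ", "))
print(same)

# A typical use: embedding a whole vector in a single message
msg <- paste("Squares:", toString(x))
print(msg)   # "Squares: 1, 4, 9, 16, 25"
```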

4. When a string is printed to the console it is wrapped in double quotes. The noquote function removes those quotes from the printed output.

x <- c("But","when","i","have","seen","you","eventually")
y <- noquote(x)
x
y

Out:

> x
[1] "But"        "when"       "i"          "have"       "seen"       "you"        "eventually"
> y
[1] But        when       i          have       seen       you        eventually

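Formatting numbers

1. The formatC function lets you specify fixed or scientific formatting, the number of digits, and the output width. It returns a character vector or array. By default, real numbers are shown with four significant digits.
The format argument selects the style (for example "e" for scientific notation). The digits argument gives the number of significant digits in the default "g" format, and the number of digits after the decimal point in the "f" and "e" formats. The official documentation describes width as "the total field width".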

pow <- 1:3
powers_of_e <- exp(pow)

formatC(powers_of_e)
formatC(powers_of_e,format="e")
formatC(powers_of_e,digits=3,width=10)

Out:

[1] "2.718" "7.389" "20.09"
[1] "2.7183e+00" "7.3891e+00" "2.0086e+01"
[1] "      2.72" "      7.39" "      20.1"
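
Two more formatC calls that may be useful in practice are sketched below: fixed-point output with format = "f", and zero padding with flag = "0". The variable names fixed and padded are illustrative only.

```r
powers_of_e <- exp(1:3)

# With format = "f", digits counts the places after the decimal point
fixed <- formatC(powers_of_e, format = "f", digits = 2)
print(fixed)    # "2.72" "7.39" "20.09"

# flag = "0" pads with zeros up to the requested width
padded <- formatC(1:3, width = 3, flag = "0")
print(padded)   # "001" "002" "003"
```

Zero padding of this kind is handy when generating file names that should sort correctly.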

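2. The sprintf function works much like printf in C.
%s stands for a string, %f for a fixed-format floating-point number, %e for a floating-point number in scientific notation, and %d for an integer.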

sprintf("To three decimal places, e ^ %d = %.3f", pow, powers_of_e)

Out:

[1] "To three decimal places, e ^ 1 = 2.718"  
[2] "To three decimal places, e ^ 2 = 7.389" 
[3] "To three decimal places, e ^ 3 = 20.086"
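
The placeholders can also carry a width and flags, as in C. The sketch below builds aligned rows from three vectors; the labels vector is a made-up addition for the example.

```r
pow <- 1:3
powers_of_e <- exp(pow)
labels <- c("one", "two", "three")   # hypothetical labels

# %-6s  left-aligns a string in 6 characters
# %5.1f right-aligns a number in 5 characters with 1 decimal place
# %03d  zero-pads an integer to 3 digits
rows <- sprintf("%-6s|%5.1f|%03d", labels, powers_of_e, pow)
print(rows)
```

Like the example above, sprintf is vectorized: it returns one formatted string per element of its arguments.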

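3. The format function formats values in much the same way as formatC.
The digits argument sets the number of significant digits to keep, the scientific argument decides whether scientific notation is used, and trim = TRUE suppresses the leading blanks that would otherwise pad the values to a common width.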

format(powers_of_e, digits=3, scientific=TRUE,trim=TRUE)

Out:

[1] "2.72e+00" "7.39e+00" "2.01e+01"
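
The effect of trim is easier to see when the values have different widths. A small sketch with made-up numbers:

```r
x <- c(1, 10, 100)   # hypothetical values of different widths

# By default, values are right-justified to a common width
padded <- format(x)
print(padded)    # "  1" " 10" "100"

# trim = TRUE drops the padding blanks
trimmed <- format(x, trim = TRUE)
print(trimmed)   # "1" "10" "100"

# nsmall sets a minimum number of decimal places
two_dp <- format(2.5, nsmall = 2)
print(two_dp)    # "2.50"
```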

Changing case

toupper("you were even more divine and adorable than i fancied")

tolower("YOU CANNOT SAY I AM LING ,FOR IF IT IS NOT TRUE")

Out:

[1] "YOU WERE EVEN MORE DIVINE AND ADORABLE THAN I FANCIED"
[1] "you cannot say i am ling ,for if it is not true"
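
toupper and tolower convert the whole string. To capitalize only the first letter, one option is to combine them with substring, as in this sketch. The helper capitalize is hypothetical, not a base R function.

```r
# Hypothetical helper, not part of base R:
# upper-case the first character and keep the rest unchanged
capitalize <- function(s) {
  paste0(toupper(substring(s, 1, 1)), substring(s, 2))
}

words <- c("but", "when", "i")
capitalized <- capitalize(words)
print(capitalized)   # "But" "When" "I"
```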

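Extracting substrings

The substring and substr functions both extract substrings. The difference is that substring returns a result as long as its longest input, whereas substr returns a result only as long as its first input. (The elements of the second argument are paired with the third argument, recycled as needed, to give each start and end position.) In the example, poem_sen has five elements and 1:6 has six, so substring returns six results and substr returns five.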

poem_sen <- c(
  "I will be content with merely mising you",
  "rather than die to see you so",
  "Don't worry about aging",
  "For you must be dazzling",
  "even when you are greying"
)
substring(poem_sen, 1:6, 10)
substr(poem_sen, 1:6, 10)

Out:

[1] "I will be " "ather tha"  "n't worr"   " you mu"    " when "     "l be "
[1] "I will be " "ather tha"  "n't worr"   " you mu"    " when " 
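
The length rule can be checked on a smaller input. The vector s below is a made-up example.

```r
s <- c("abcdef", "ghijkl")   # hypothetical input

# substr: the result is as long as the first argument (2 elements)
a <- substr(s, 1:3, 3)
print(a)   # "abc" "hi"

# substring: the result is as long as the longest argument (3 elements),
# so s is recycled and the third result comes from s[1] again
b <- substring(s, 1:3, 3)
print(b)   # "abc" "hi" "c"

# substring can omit the end position and then reads to the end of the string
tails <- substring("abcdef", 1:3)
print(tails)   # "abcdef" "bcdef" "cdef"
```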

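Splitting strings

The strsplit function splits strings at given points: each string is split wherever the second argument matches, and the result is a list with one character vector per input string.

strsplit(poem_sen, " ", fixed = TRUE)  # split on spaces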

Out:

[[1]]
[1] "I"       "will"    "be"      "content" "with"    "merely"  "mising"  "you"    

[[2]]
[1] "rather" "than"   "die"    "to"     "see"    "you"    "so"    

[[3]]
[1] "Don't" "worry" "about" "aging"

[[4]]
[1] "For"      "you"      "must"     "be"       "dazzling"

[[5]]
[1] "even"    "when"    "you"     "are"     "greying"

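We can also split on a regular expression. The matched text (here, any uppercase letter) is removed, and a match at the very start of a string leaves an empty string as the first piece.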

strsplit(poem_sen, "[A-Z]")

Out:

[[1]]
[1] ""                                        " will be content with merely mising you"

[[2]]
[1] "rather than die to see you so"

[[3]]
[1] ""                       "on't worry about aging"

[[4]]
[1] ""                        "or you must be dazzling"

[[5]]
[1] "even when you are greying"
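
Because strsplit returns a list, a few extra steps are usually needed to work with the pieces. The sketch below uses two of the sentences from above; the variable names are illustrative.

```r
poem_sen <- c("Don't worry about aging", "For you must be dazzling")
words <- strsplit(poem_sen, " ", fixed = TRUE)

# Number of words in each sentence
n_words <- lengths(words)
print(n_words)       # 4 5

# First word of each sentence, as a character vector
first_words <- vapply(words, function(w) w[1], character(1))
print(first_words)   # "Don't" "For"

# Flatten the list into one vector of all words
all_words <- unlist(words)
print(length(all_words))   # 9
```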

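File paths

Paths are either absolute or relative. In a relative path, . stands for the current directory, .. for the parent directory, and ~ for the current user's home directory.
path.expand only expands a leading ~ to the home directory; it leaves . and .. unchanged, as the output below shows.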

path.expand(".")
path.expand("..")
path.expand("~")

Out:

[1] "."
[1] ".."
[1] "C:/Users/Beryl/Documents"
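
To turn . or .. into an absolute path, base R provides normalizePath. A minimal sketch follows; the result depends on the current working directory, so no output is shown.

```r
# path.expand leaves "." alone because there is no leading ~ to expand
unchanged <- path.expand(".")
print(unchanged)

# normalizePath resolves the path against the working directory.
# winslash = "/" only matters on Windows, where the default is a backslash.
abs_here <- normalizePath(".", winslash = "/")
abs_parent <- normalizePath("..", winslash = "/")
print(abs_here)
print(abs_parent)
```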

basename returns only the file name, and dirname returns only the directory.

file <- "E:/Ksoftware/Rstudio/R/modules/ModuleTools.R"
basename(file)
dirname(file)

Out:

[1] "ModuleTools.R"
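[1] "E:/Ksoftware/Rstudio/R/modules"

getwd()  # show the working directory, where R reads and writes files
setwd("E:/Data/Rstudio")  # change the working directory
file.path("E:", "Data", "R")  # joins the parts, inserting forward slashes between them
R.home()  # where R is installed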

Out:

[1] "C:/Users/Beryl/Documents"
[1] "E:/Data/R"
[1] "E:/KSOFTW~1/R/R-36~1.1"
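
These functions fit together: file.path builds a path, and basename and dirname take it apart again. In the sketch below the directory and file names are made up, and tools::file_ext comes from the tools package that ships with R.

```r
# Hypothetical relative path, built from its parts
p <- file.path("data", "raw", "scores.csv")
print(p)           # "data/raw/scores.csv"

file_name <- basename(p)
dir_name <- dirname(p)
print(file_name)   # "scores.csv"
print(dir_name)    # "data/raw"

# File extension, without the dot
ext <- tools::file_ext(p)
print(ext)         # "csv"

# Joining the two halves gives back the original path
rebuilt <- file.path(dir_name, file_name)
print(identical(rebuilt, p))
```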