B.10 Taking web page screenshots with PhantomJS

Winston Chang developed the webshot package for taking screenshots of web pages. It depends on PhantomJS, so both need to be installed first:

install.packages("webshot")
webshot::install_phantomjs()

Taking the web page https://www.r-project.org/ as an example:

library(webshot)
webshot("https://www.r-project.org/", "r.png")
webshot("https://www.r-project.org/", "r.pdf") # Can also output to PDF
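
Beyond the two calls above, webshot() accepts arguments that control what is captured. The following is a minimal sketch, not a recommended workflow: capture_variants() is a hypothetical helper written for illustration (it is not part of webshot), and the viewport size, zoom factor, and delay are arbitrary values chosen for the example.

```r
# Hypothetical helper (not part of webshot): capture one page three ways.
# Defining it has no side effects; calling it needs webshot, PhantomJS,
# and network access to the target page.
capture_variants <- function(url, prefix) {
  files <- paste0(prefix, c("-viewport.png", "-zoom.png", "-delay.png"))

  # Clip the image to a 1200x800 viewport instead of the full page.
  webshot::webshot(url, files[1], vwidth = 1200, vheight = 800,
                   cliprect = "viewport")

  # zoom = 2 doubles the number of pixels in each direction.
  webshot::webshot(url, files[2], zoom = 2)

  # Wait 2 seconds before capturing, for pages that load content late.
  webshot::webshot(url, files[3], delay = 2)

  invisible(files)
}
```

Calling capture_variants("https://www.r-project.org/", "r") would then write r-viewport.png, r-zoom.png, and r-delay.png to the working directory.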

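webshot can also capture the content of an R Markdown document. Note that the document is first rendered to HTML, and the resulting web page is then captured: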

rmdshot(system.file("examples/knitr-minimal.Rmd", package = "knitr"), file = "screenshots/knitr-minimal.png")

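Resizing a screenshot to specific dimensions with resize() requires an additional system dependency: GraphicsMagick (recommended) or ImageMagick must be installed. Reducing the file size with shrink() additionally requires optipng, whose log is printed when the code runs.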
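
Before running the resize and shrink steps, it may help to check whether these external programs are visible from R. The sketch below uses only base R and assumes the usual executable names (gm for GraphicsMagick, convert for ImageMagick, optipng for OptiPNG); an installation may use different names, and on Windows an unrelated system utility is also called convert, so treat the result as a hint rather than a guarantee.

```r
# Executable names are assumptions; adjust them for your installation.
tools <- c(GraphicsMagick = "gm", ImageMagick = "convert", OptiPNG = "optipng")

# Sys.which() returns "" for programs that are not found on the PATH.
found <- nzchar(Sys.which(tools))
names(found) <- names(tools)

print(found)
```

A FALSE entry suggests that the corresponding tool is missing or not on the PATH. With the tools in place, the resize and shrink steps look like this: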

# Can specify pixel dimensions for resize()
webshot("https://www.r-project.org/", "r-small.png") %>%
  resize("400x") %>%
  shrink()
** Processing: r-small.png
400x442 pixels, 4x8 bits/pixel, RGB+alpha
Reducing image to 3x8 bits/pixel, RGB
Input IDAT size = 70570 bytes
Input file size = 70867 bytes

Trying:
  zc = 9  zm = 8  zs = 0  f = 0         IDAT size = 59441
  zc = 9  zm = 8  zs = 1  f = 0
  zc = 1  zm = 8  zs = 2  f = 0
  zc = 9  zm = 8  zs = 3  f = 0
  zc = 9  zm = 8  zs = 0  f = 5
  zc = 9  zm = 8  zs = 1  f = 5
  zc = 1  zm = 8  zs = 2  f = 5
  zc = 9  zm = 8  zs = 3  f = 5
                               
Selecting parameters:
  zc = 9  zm = 8  zs = 0  f = 0         IDAT size = 59441

Output IDAT size = 59441 bytes (11129 bytes decrease)
Output file size = 59714 bytes (11153 bytes = 15.74% decrease)