# Scribd-dl ![nodedotjs](https://img.shields.io/badge/node.js-v21.6.1-339933.svg?style=flat&logo=nodedotjs&logoColor=white) ![npm](https://img.shields.io/badge/npm-10.2.4-dc2c35.svg?style=flat&logo=npm&logoColor=white)
[![License: GPLv3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

## About ##
Scribd-dl helps download documents from [scribd.com](https://www.scribd.com/) without a membership or sign-in.  
Two modes are available:  
- `default`: the .pdf file is generated by Chromium's print function  
- `image-based`: the .pdf file is generated from image snapshots taken of the pages  

The `default` mode is preferred because it generates files faster and produces smaller files.  
The `image-based` mode is a backup solution in case the `default` mode doesn't work as expected.

Friendly reminder:  
1. The .pdf generated by `image-based` mode is made up of images, so it does NOT contain any text.  

## Development Plan ##
Scribd obfuscates the .pdf files, so text copied from the documents comes out garbled.  
De-obfuscation is the next planned stage.

## Prerequisites ##
Please make sure the following tool(s) / application(s) are properly set up and ready to use:
- Node.js ([https://nodejs.org/](https://nodejs.org/))

## Setup ##
1. Download repository  
```console
git clone https://github.com/rkwyu/scribd-dl
```
2. Install dependencies
```console
cd ./scribd-dl
npm install
```

## Configuration ##
```ini
[SCRIBD]
rendertime=100

[DIRECTORY]
output=output
```
Configuration can be altered in `config.ini`.  
`rendertime` is the waiting time in milliseconds for rendering a single page; it applies only to `default` mode. (A value that is too short might cause missing images.)  
`output` is the output directory for generated .pdf files.
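
To illustrate how the two settings above might be read, here is a minimal sketch of an INI reader in plain Node.js. It is not taken from the scribd-dl source: `parseIni`, `renderTimeMs`, and `outputDir` are hypothetical names, and the project may well use a library instead.

```javascript
// Hypothetical helper, not part of scribd-dl: parses a small INI text
// into an object of the form { SECTION: { key: value } }.
function parseIni(text) {
  const result = {};
  let section = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines.
    if (line === "" || line.startsWith(";") || line.startsWith("#")) continue;
    const header = line.match(/^\[(.+)\]$/);
    if (header) {
      section = header[1];
      result[section] = {};
      continue;
    }
    const eq = line.indexOf("=");
    // Ignore lines without "=" and keys that appear before any section.
    if (eq === -1 || section === null) continue;
    result[section][line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return result;
}

// Sample text mirroring the config.ini shown above.
const sample = "[SCRIBD]\nrendertime=100\n\n[DIRECTORY]\noutput=output\n";
const config = parseIni(sample);

// Values are strings in INI files, so rendertime is converted to a number.
const renderTimeMs = Number.parseInt(config.SCRIBD.rendertime, 10);
const outputDir = config.DIRECTORY.output;
console.log(renderTimeMs, outputDir);
```

In a real script the sample string would be replaced by the contents of `config.ini`, for example read with `fs.readFileSync("config.ini", "utf8")`.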

## Usage (CLI) ##
```console
Usage: npm start [options] url
Options:  
  /d            default: generated by chromium's print function
  /i        image-based: generated by image snapshots taken for pages
```
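
The option handling described above can be sketched as follows. This is an illustration only, not the project's actual argument parser: `MODES` and `parseArgs` are hypothetical names, and the sketch assumes that `default` mode is used when no option is given.

```javascript
// Hypothetical sketch, not part of scribd-dl: maps the documented
// "/d" and "/i" options to mode names and picks out the URL.
const MODES = { "/d": "default", "/i": "image-based" };

function parseArgs(argv) {
  let mode = "default"; // assumption: no option means default mode
  let url = null;
  for (const arg of argv) {
    if (Object.hasOwn(MODES, arg)) {
      mode = MODES[arg];
    } else if (url === null) {
      url = arg;
    } else {
      throw new Error(`Unexpected argument: ${arg}`);
    }
  }
  if (url === null) {
    throw new Error("Usage: npm start [options] url");
  }
  return { mode, url };
}

// Example with a literal argument list; a real script would pass
// process.argv.slice(2) instead.
const parsed = parseArgs(["/i", "https://www.scribd.com/doc/249398282/The-Minds-of-Billy-Milligan-Daniel-Keyes"]);
console.log(parsed.mode, parsed.url);
```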

#### Example 1: Download 《The Minds of Billy Milligan》 ####
```console
npm start https://www.scribd.com/doc/249398282/The-Minds-of-Billy-Milligan-Daniel-Keyes
```

#### Example 2: Download 《The Minds of Billy Milligan》 using image-based mode ####
```console
npm start /i https://www.scribd.com/doc/249398282/The-Minds-of-Billy-Milligan-Daniel-Keyes
```

## License ##
[GNU GPL v3.0](LICENSE.md)