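Resource overview
This resource optimizes line-by-line reading of very large text files. Three approaches are in common use: the first is LineNumberReader, the second is RandomAccessFile, and the third is a memory-mapped file, obtained by calling getChannel().map(...) on a RandomAccessFile. The code provided here builds on RandomAccessFile and adds an internal buffer, which improves throughput. In testing, 10 million lines took 1 second to read, and 100 million lines took 103 seconds (roughly 13 times faster than 1438 seconds).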
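
To make the three approaches concrete, here is a minimal sketch that counts the lines of a small temporary file with each of them. The class name LineReadSketch, its method names, and the sample file are hypothetical and are not part of the downloadable code; the sketch only illustrates the standard JDK calls involved and makes no claim about relative speed.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the downloadable resource.
class LineReadSketch {

    // Approach 1: LineNumberReader, a buffered character reader.
    static long countWithLineNumberReader(Path file) {
        try (LineNumberReader reader = new LineNumberReader(new FileReader(file.toFile()))) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Approach 2: plain RandomAccessFile.readLine(), which has no user-space buffer.
    static long countWithRandomAccessFile(Path file) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long lines = 0;
            while (raf.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Approach 3: memory-mapped file via getChannel().map(...).
    // A single mapping is limited to Integer.MAX_VALUE bytes, so a larger
    // file would have to be mapped in several windows (not shown here).
    static long countWithMappedFile(Path file) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r");
             FileChannel channel = raf.getChannel()) {
            long size = channel.size();
            if (size == 0) {
                return 0;
            }
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
            long lines = 0;
            byte last = '\n';
            while (map.hasRemaining()) {
                last = map.get();
                if (last == '\n') {
                    lines++;
                }
            }
            // Count a final line that has no trailing newline.
            return last == '\n' ? lines : lines + 1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a three-line sample file to a temporary location.
    static Path writeSample() {
        try {
            Path file = Files.createTempFile("line-read-sketch", ".txt");
            Files.write(file, "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path sample = writeSample();
        System.out.println(countWithLineNumberReader(sample));
        System.out.println(countWithRandomAccessFile(sample));
        System.out.println(countWithMappedFile(sample));
    }
}
```

All three methods count the same three lines in the sample; they differ only in how bytes reach the program, which is where the buffering described above comes in.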

Code snippet and file information
package com.gqshao.file.io;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
public class BufferedRandomAccessFile extends RandomAccessFile {
    static final int LogBuffSz_ = 16; // 64K buffer
    public static final int BuffSz_ = (1 << LogBuffSz_);
    static final long BuffMask_ = ~(((long) BuffSz_) - 1L);
    private String path_;
    /*
     * This implementation is based on the buffer implementation in Modula-3's
     * "Rd", "Wr", "RdClass", and "WrClass" interfaces.
     */
    private boolean dirty_; // true iff unflushed bytes exist
    private boolean syncNeeded_; // dirty_ can be cleared by e.g. seek, so track sync separately
    private long curr_; // current position in file
    private long lo_, hi_; // bounds on characters in "buff"
    private byte[] buff_; // local buffer
    private long maxHi_; // this.lo + this.buff.length
    private boolean hitEOF_; // buffer contains last file block?
    private long diskPos_; // disk position
    /*
     * To describe the above fields, we introduce the following abstractions for
     * the file "f":
     *
     * len(f)    the length of the file
     * curr(f)   the current position in the file
     * c(f)      the abstract contents of the file
     * disk(f)   the contents of f's backing disk file
     * closed(f) true iff the file is closed
     *
     * "curr(f)" is an index in the closed interval [0, len(f)]. "c(f)" is a
     * character sequence of length "len(f)". "c(f)" and "disk(f)" may differ if
     * "c(f)" contains unflushed writes not reflected in "disk(f)". The flush
     * operation has the effect of making "disk(f)" identical to "c(f)".
     *
     * A file is said to be *valid* if the following conditions hold:
     *
     * V1. The "closed" and "curr" fields are correct:
     *
     * f.closed == closed(f)
     * f.curr == curr(f)
     *
     * V2. The current position is either contained in the buffer or just past
     * the buffer:
     *
     * f.lo <= f.curr <= f.hi
     *
     * V3. Any (possibly) unflushed characters are stored in "f.buff":
     *
     * (forall i in [f.lo, f.curr): c(f)[i] == f.buff[i - f.lo])
     *
     * V4. For all characters not covered by V3, c(f) and disk(f) agree:
     *
     * (forall i in [f.lo, len(f)): i not in [f.lo, f.curr) => c(f)[i] ==
     * disk(f)[i])
     *
     * V5. "f.dirty" is true iff the buffer contains bytes that should be
     * flushed to the file; by V3 and V4, only part of the buffer can be dirty.
     *
     * f.dirty == (exists i in [f.lo, f.curr): c(f)[i] != f.buff[i - f.lo])
     *
     * V6. this.maxHi == this.lo + this.buff.length
     *
     * Note that "f.buff" can be "null" in a valid file, since the range of
     * characters in V3 is empty when "f.lo == f.curr".
     *
     * A file is said to be *ready* if the buffer contai
 Attribute   Size    Date        Time   Name
 ---------   -----   ----------  -----  ----
 File        3915    2016-01-17  20:31  FileUtil.java
 File        11624   2016-01-17  20:29  BufferedRandomAccessFile.java
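
The listing above is cut off before the read path, so the following is only a sketch of how a block-aligned buffer could work given the 64K constants and the lo/hi/curr bounds (invariant V2: lo <= curr <= hi). The class BlockBufferSketch and all of its methods are hypothetical and are not taken from BufferedRandomAccessFile.java; it is read-only, so the dirty and flush handling is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical read-only sketch; not the code from the download.
class BlockBufferSketch {
    static final int LOG_BUFF_SZ = 16;                       // 2^16 bytes = 64K
    static final int BUFF_SZ = 1 << LOG_BUFF_SZ;
    static final long BUFF_MASK = ~(((long) BUFF_SZ) - 1L);  // clears the low 16 bits

    // Start offset of the 64K block that contains pos.
    static long blockStart(long pos) {
        return pos & BUFF_MASK;
    }

    private final RandomAccessFile file;
    private final byte[] buff = new byte[BUFF_SZ];
    private long lo = 0, hi = 0, curr = 0;                   // invariant: lo <= curr <= hi

    BlockBufferSketch(RandomAccessFile file) {
        this.file = file;
    }

    // Returns the next byte (0..255), or -1 at end of file.
    int read() {
        if (curr >= hi) {
            try {
                // Refill from the aligned block that contains curr.
                lo = blockStart(curr);
                file.seek(lo);
                int n = 0;
                while (n < BUFF_SZ) {
                    int r = file.read(buff, n, BUFF_SZ - n);
                    if (r < 0) {
                        break;
                    }
                    n += r;
                }
                hi = lo + n;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (curr >= hi) {
                return -1;
            }
        }
        return buff[(int) (curr++ - lo)] & 0xFF;
    }

    // Reads one line as UTF-8; returns null at end of file.
    // Simplification: every '\r' byte is dropped, so "\r\n" and "\n" both end a line.
    String readLine() {
        int b = read();
        if (b < 0) {
            return null;
        }
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        while (b >= 0 && b != '\n') {
            if (b != '\r') {
                line.write(b);
            }
            b = read();
        }
        return new String(line.toByteArray(), StandardCharsets.UTF_8);
    }

    static long countLines(Path path) {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r")) {
            BlockBufferSketch reader = new BlockBufferSketch(raf);
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes "line-0", "line-1", ... to a temporary file, one per line.
    static Path writeSample(int lines) {
        try {
            Path path = Files.createTempFile("block-buffer-sketch", ".txt");
            StringBuilder text = new StringBuilder();
            for (int i = 0; i < lines; i++) {
                text.append("line-").append(i).append('\n');
            }
            Files.write(path, text.toString().getBytes(StandardCharsets.UTF_8));
            return path;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, BlockBufferSketch.countLines(BlockBufferSketch.writeSample(20000)) reads a file larger than one 64K block, so the refill path runs several times. Because the mask clears the low 16 bits, blockStart(70000) is 65536, the start of the second block.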