Minidx Support Forum
Subject: MAX_EXTRACT_TEXT_SIZE upper limit?
heroyo
Newbie
UID 110
Digest Posts 0
Credits 0
Posts 4
Reading Access 10
Registered 19-8-2008
Status Offline
#1
Post at 19-8-2008 21:37
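Dear all,
I have a question about the MAX_EXTRACT_TEXT_SIZE constant declared in the Extract text Demo.
Is there any special significance to using 64 MB as the upper limit for extracted text?
For example, was 64 MB determined through testing to be the most suitable limit for performance?
Was the limit set after testing extraction on the 200 file formats?
Or does it have no particular meaning?
Thank you!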
' no more than 64MB of raw text for a resume!
Private Const MAX_EXTRACT_TEXT_SIZE As Integer = 64 * 1024 * 1024
Best Regards,
Hiro
dingzhigang
Administrator
UID 2
Digest Posts 0
Credits 40
Posts 74
Reading Access 200
Registered 27-3-2007
Status Offline
#2
Post at 20-8-2008 10:47
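Hello heroyo,
The limit is set to 64 MB because of a restriction in Windows itself; in practice there is little value in indexing text beyond that size.
You can of course change this value. Open the registry and go to:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex
MaxTextFilterBytes (the default is 25,000,000)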
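The registry change described above could also be written as a .reg file. This is only a sketch and not part of the original reply: it assumes MaxTextFilterBytes is stored as a REG_DWORD, and the 64 MB value (0x04000000 = 67,108,864 bytes) is chosen purely to match the demo constant. Export the key as a backup before importing anything like this.

```reg
Windows Registry Editor Version 5.00

; Sketch only: assumes MaxTextFilterBytes is a REG_DWORD value.
; 0x04000000 = 64 * 1024 * 1024 = 67,108,864 bytes (illustrative choice).
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex]
"MaxTextFilterBytes"=dword:04000000
```

Writing under HKEY_LOCAL_MACHINE normally requires administrator rights.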
heroyo
#3
Post at 20-8-2008 12:15
Thank you!!
I have one more question.
How is the ExtractText.dll component produced?
Is it a component released by Microsoft?
Thank you!
Best Regards,
Hiro
dingzhigang
#4
Post at 20-8-2008 12:18
It is explained in this post:
http://blog.minidx.com/2007/12/31/334.html
The introduction at the top of that post should be enough.
heroyo
#5
Post at 20-8-2008 12:54
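Dear dingzhigang,
Thank you for your reply.
To implement Lucene full-text search in my company's product,
we reference the ExtractText.dll component (our code is developed in VB.NET 2005).
We have a first working version that extracts the body text of txt, doc, ppt, xls, docx, pptx, and xlsx attachments,
but for security reasons my manager would prefer that we not reference components of unknown origin.
Is ExtractText.dll open source?
And which programming language was it developed in?
Thank you!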
Best Regards,
Hiro
dingzhigang
#6
Post at 20-8-2008 14:36
Sorry, there are currently no plans to open-source this component.
It is implemented in C++.
heroyo
#7
Post at 20-8-2008 15:09
Understood. I will look deeper into IFilter myself,
and I hope to write the attachment text extraction in VB.NET.
Thank you very much all the same!!
All times are GMT+8, the time now is 20-11-2008 18:07