Saturday, February 27, 2016

Fuzzy Hashing

While doing some research, I needed to find out whether two web documents are the same. For instance, I want to know if the URL http://www.eutrigtreat.eu points to the same content as http://www.eutrigtreat.com (this example is relatively easy, but you can find completely different URLs pointing to the same web content).
At first sight, running a normal MD5 hash over each website's content seems to solve the problem. But websites are not static, and many of them generate dynamic content, such as the current date and time, an entire HTML section, or just one extra blank space. Hashing a website's content now and hashing it again one second later can produce a completely different hash, leading to the misleading conclusion that the two websites are not the same.
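To make the problem concrete, here is a minimal sketch using Python's standard hashlib module. The two page strings are made up for illustration and differ only in a timestamp.

```python
import hashlib

# Two hypothetical snapshots of the same page, taken one second apart.
page_now = "<html><body><p>Welcome!</p><span>12:00:01</span></body></html>"
page_later = "<html><body><p>Welcome!</p><span>12:00:02</span></body></html>"

digest_now = hashlib.md5(page_now.encode("utf-8")).hexdigest()
digest_later = hashlib.md5(page_later.encode("utf-8")).hexdigest()

# A single changed character yields an unrelated digest, so an
# equality check reports "different content".
print(digest_now)
print(digest_later)
print("same content according to md5:", digest_now == digest_later)
```

The digests carry no notion of "almost equal": the only question MD5 can answer is whether the two inputs are byte-for-byte identical.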
Fortunately, there is a better hashing method that works around this problem and also unlocks useful information about how similar the two websites are.
Content Triggered Piecewise Hashing (CTPH) is the solution. The basic idea of the algorithm is that the input is hashed piece by piece, with the piece boundaries triggered by the content itself, so a change in a document only affects the pieces around it and the hash output is not completely different.
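The piecewise idea can be sketched with a toy example. This is not ssdeep's actual algorithm: the rolling value, the `window` and `modulus` parameters, and the function names `piece_digests` and `similarity` are all assumptions chosen to keep the sketch short.

```python
import hashlib


def piece_digests(data: bytes, window: int = 4, modulus: int = 16) -> list:
    """Cut data where a rolling value hits a trigger, then hash each piece."""
    pieces = []
    start = 0
    for i in range(len(data)):
        # Toy rolling value: sum of the last `window` bytes. Real CTPH
        # implementations use a proper rolling hash here.
        rolling = sum(data[max(0, i - window + 1):i + 1])
        if rolling % modulus == modulus - 1:  # content-defined boundary
            pieces.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        pieces.append(data[start:])  # trailing piece without a trigger
    return [hashlib.md5(piece).hexdigest()[:8] for piece in pieces]


def similarity(a: bytes, b: bytes) -> float:
    """Share of piece digests the two inputs have in common (0.0 to 1.0)."""
    set_a, set_b = set(piece_digests(a)), set(piece_digests(b))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


old = b"Also called fuzzy hashes, Ctph can match inputs that have homologies."
new = b"Also called fuzzy hashes, CTPH can match inputs that have homologies."
print(piece_digests(old))
print(piece_digests(new))
print(similarity(old, new))
```

Because the boundaries depend only on the last few bytes, pieces that lie away from an edit tend to keep their digests, which is what lets two slightly different inputs still be recognized as related.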

You can find more details about fuzzy hashing, and a tool to compute it, at ssdeep.

Since I needed to use it from Python, I used the wrapper around ssdeep available at https://pypi.python.org/pypi/ssdeep.

Fuzzy hashing example:
>>> import ssdeep
>>> hash1 = ssdeep.hash('Also called fuzzy hashes, Ctph can match inputs that have homologies.')
>>> hash1
'3:AXGBicFlgVNhBGcL6wCrFQEv:AXGHsNhxLsr2C'
>>> hash2 = ssdeep.hash('Also called fuzzy hashes, CTPH can match inputs that have homologies.')
>>> hash2
'3:AXGBicFlIHBGcL6wCrFQEv:AXGH6xLsr2C'
>>> # comparing
>>> ssdeep.compare(hash1, hash2)
22

For the problem stated at the beginning, hashing the two sites gives a fuzzy hash matching score of 91%. With a score this high, in the context of web documents we can say with a good degree of confidence that both URLs point to the same content.


Sunday, July 29, 2012

Outlook - Hidden Folder !?



Last week an Outlook problem came up. A mysterious folder disappeared! No one could find it, but when we tried to create a new one with the same name, an error message said that the folder already exists.

Googling the problem, I came across an excellent tool for solving Outlook/Exchange related problems:

MFCMAPI

So how do you use it to resolve this problem?

Just run it, choose Logon in the Session menu, and then pick the right Outlook profile.

Choose the desired MAPI store, look for IPM_SUBTREE, and expand it.
Then you will see all available folders, including the hidden one!




Select the hidden folder and look for the property named PR_ATTR_HIDDEN.

Uncheck the Boolean checkbox and voilà, the folder is visible again.







Monday, May 14, 2012

Powershell - Proxy Functions


In the last article, I wrote a PowerShell function to calculate folder sizes. It wraps the Get-ChildItem cmdlet and adds a ScriptProperty that calculates the folder size to the output of Get-ChildItem. But it has a major drawback: it only supports the default behavior of Get-ChildItem. I couldn't run something like Get-ChildItem -Force, because the function only executes Get-ChildItem without any switch. The best way to keep the full behavior of Get-ChildItem while extending its functionality is to make a proxy command. You can use the MetaProgramming module by Jeffrey Snover to simplify the task.


PS> Import-Module MetaProgramming
PS> New-ProxyCommand Get-ChildItem -AddParameter FolderSize > Get-ExtendedChildItem.ps1


Add the switch parameter in the Param section:

[Switch] ${FolderSize}


And put this piece of code in the begin section:


begin
{
    try {
        $outBuffer = $null
        if ($PSBoundParameters.TryGetValue('OutBuffer', [ref]$outBuffer))
        {
            $PSBoundParameters['OutBuffer'] = 1
        }
        $wrappedCmd = $ExecutionContext.InvokeCommand.GetCommand('Get-ChildItem', [System.Management.Automation.CommandTypes]::Cmdlet)
        if ($FolderSize)
        {
            # FolderSize is not a Get-ChildItem parameter, so drop it before splatting
            [Void]$PSBoundParameters.Remove("FolderSize")
            $scriptCmd = {& $wrappedCmd @PSBoundParameters |
                ForEach-Object {
                    if ($_.PSIsContainer) {
                        # Attach a Length property that sums the files below this folder
                        Add-Member -InputObject $_ -MemberType ScriptProperty -Name Length -Value `
                        {
                            $size = 0
                            Get-ChildItem -Recurse $this.FullName | Where-Object {!$_.PSIsContainer} |
                                ForEach-Object {$size += $_.Length}
                            $size
                        } -PassThru
                    }
                    else {
                        $_
                    }
                }
            }
        }
        else {$scriptCmd = {& $wrappedCmd @PSBoundParameters }}
        $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)
        $steppablePipeline.Begin($PSCmdlet)
    } catch {
        throw
    }
}


Wrap the code in Function Get-ExtendedChildItem { CODE }, load it, and it's done.
(You can also name it Get-ChildItem; PowerShell resolves functions before cmdlets.)


Example:









Tuesday, May 8, 2012

Powershell - Getting Folders Size

How many times have you wished you could not only list some files but also see the size of each folder?
Today I set up a little script to give a hand with this.

The output comes from the Get-ChildItem cmdlet (dir), but the System.IO.DirectoryInfo objects in it are extended with a ScriptProperty that displays the folder size.

It will do the job for now, but it can only use the default behavior of Get-ChildItem.

;)

Function Get-ExtendedChildItem {

    [cmdletbinding()]

    Param(
        [Parameter(Mandatory=$false)]
        [string]$FolderPath
    )

    if ($FolderPath)
    {$d = Get-ChildItem $FolderPath}
    else
    {$d = Get-ChildItem}

    $d | Where-Object {$_.PSIsContainer} |
        ForEach-Object {
            Add-Member -InputObject $_ -MemberType ScriptProperty -Name Length -Value `
            {#Get
                $size = 0
                Get-ChildItem -Recurse $this.FullName | Where-Object {!$_.PSIsContainer} |
                    ForEach-Object {$size += $_.Length}
                $size
            }
        }
    $d
}
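
For comparison, the calculation the ScriptProperty performs (summing the length of every file below a folder) can be sketched in Python with the standard library. The `folder_size` name and the throwaway directory used to exercise it are assumptions for illustration, not part of the PowerShell function.

```python
import os
import tempfile


def folder_size(path: str) -> int:
    """Sum the sizes of all files under path, recursing into subfolders."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


# Small self-check on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "a.bin"), "wb") as handle:
        handle.write(b"x" * 10)
    with open(os.path.join(tmp, "sub", "b.bin"), "wb") as handle:
        handle.write(b"y" * 5)
    demo_total = folder_size(tmp)
    print(demo_total)  # 15
```

As in the PowerShell version, the walk happens every time the size is asked for, so large trees will be slow to report.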

Thursday, May 3, 2012

Setting up a Test Environment Network



One month ago, a Cisco Catalyst 3750 switch that was in production decided to stop working; it simply didn't boot.
Days later I decided to try booting it again, and guess what? It booted :P (probably power supply problems). It is a piece of equipment with clear problems, but still adequate for use in a testing environment.
I also managed to get an inactive Cisco 2600 router, so I have all I need to create my own small test environment network.


The goal is to create a private network using the router as gateway, like this:


Your Company Network (Outside)----> Router (inside)----> Your Test Network


The outside interface will be configured with an IP address from the internal company network, and the router will then do NAT to give the test network internet access.
The Cisco switch is already configured with production settings. Since I want to keep them, I will try not to change anything on the switch (not a big deal; I just have to keep in mind that the ports are in VLAN 20 and find a trunk-mode port to connect the router to).


Equipment used:
Switch Cisco Catalyst 3750
Router Cisco 2600 Series


Resetting Cisco 2600 Series


First let's wipe out the router configuration!
Make sure the configuration register is set to 0x2102.


router#configure terminal
router(config)#config-register 0x2102
router(config)#end


Then erase the startup-config:


router#erase startup-config


Now reload the router


router#reload


After the reload, the router will offer the initial configuration dialog; it is up to you whether to configure some settings through it.
Now the factory configuration is in place.


Configuring interfaces:


First let's configure the interfaces.


Outside interface:


router#configure terminal
router(config)#interface fastethernet0/1
router(config-if)#ip address 192.168.10.49 255.255.255.0
router(config-if)#ip nat outside
router(config-if)#no shutdown
router(config-if)#exit


Inside interface:


router(config)#interface fastethernet0/0
router(config-if)#no shutdown
router(config-if)#exit


router(config)#interface fastethernet0/0.1
router(config-subif)#encapsulation dot1Q 20 (VLAN 20 encapsulation)
router(config-subif)#ip address 172.16.0.1 255.255.0.0
router(config-subif)#ip nat inside
router(config-subif)#no shutdown


Configuring DHCP:


I will use the 172.16.0.0/16 network. DHCP is not strictly necessary, but I don't want to configure the computers manually...


router(config)# ip dhcp pool 172.16.0.0/16
router(dhcp-config)# network 172.16.0.0 255.255.0.0
router(dhcp-config)# default-router 172.16.0.1
router(dhcp-config)# dns-server 192.168.0.42 (your DNS server or a public one)
router(dhcp-config)# exit
router(config)# ip dhcp excluded-address 172.16.0.1 172.16.0.255


Routing the traffic to Outside (Internet):


Set up the default network, default gateway, and default route so that traffic to the internet is forwarded through the outside interface.


router(config)# ip default-network 172.16.0.0
router(config)# ip default-gateway 192.168.10.1
router(config)# ip route 0.0.0.0 0.0.0.0 FastEthernet0/1


NAT:


NAT is essential so that we can map the internal IPs to the outside IP (192.168.10.49).


router(config)#ip nat inside source list 1 interface fastethernet0/1 overload
router(config)#access-list 1 permit 172.16.0.0 0.0.255.255
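
The addressing used in this setup can be double-checked with Python's standard ipaddress module. This is only a sanity-check sketch: the variable names are made up, and the values are the ones from the configuration in this post.

```python
import ipaddress

inside_net = ipaddress.ip_network("172.16.0.0/16")
router_ip = ipaddress.ip_address("172.16.0.1")

# Subnet mask used on the inside interface and in the DHCP pool.
print(inside_net.netmask)   # 255.255.0.0
# Wildcard (inverse) mask used in the access-list entry.
print(inside_net.hostmask)  # 0.0.255.255
print(router_ip in inside_net)  # True

# Addresses kept out of the DHCP pool: 172.16.0.1 through 172.16.0.255.
excluded_first = ipaddress.ip_address("172.16.0.1")
excluded_last = ipaddress.ip_address("172.16.0.255")
excluded_count = int(excluded_last) - int(excluded_first) + 1
print(excluded_count)  # 255

# Usable host addresses in a /16, minus the excluded ones.
leasable = inside_net.num_addresses - 2 - excluded_count
print(leasable)  # 65279
```

The wildcard mask in the access list is simply the bitwise inverse of the subnet mask, which is why a /16 pairs 255.255.0.0 with 0.0.255.255.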


It's done.
Now just mess around :D

Tuesday, May 1, 2012

Moore's Law End?

Moore's Law says that the number of transistors that can be placed on an integrated circuit doubles approximately every two years, a trend that shows up as a straight line on a logarithmic graph of transistor counts.
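
The doubling rule is easy to turn into arithmetic. The sketch below assumes a made-up starting point of 1,000 transistors and a hypothetical helper named `projected_transistors`; it illustrates the rule itself, not real chip data.

```python
def projected_transistors(initial_count: int, years: float,
                          doubling_period: float = 2.0) -> float:
    """Transistor count after `years`, doubling every `doubling_period` years."""
    return initial_count * 2 ** (years / doubling_period)


# Hypothetical starting point of 1,000 transistors.
for years in (0, 2, 10, 20):
    print(years, projected_transistors(1000, years))
```

Ten years means five doublings, so the count grows by a factor of 32; twenty years means ten doublings, a factor of 1,024.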





















In this nice video, Michio Kaku talks about the end of Moore's Law and the post-silicon era.
Hope you enjoy ;)

Monday, April 30, 2012

Dropbox LAN Sync





I was slacking around when I randomly clicked into the Dropbox client preferences and saw an interesting (or not) setting:



What is that? LAN sync lets Dropbox clients sync files from another computer on the local area network, when one has them, instead of syncing them from the internet, which speeds up the process.

Let's analyze what is happening with a packet monitor:


As we can see, it sends a UDP broadcast packet about every 30 seconds to detect machines on the local network.
If it finds the files to sync on the local network, Dropbox then uses TLS to transfer the files between the local computers.

Although it is a very cool feature for speeding up syncing on your home network, between friends or your own computers, on a campus or public wireless network it just adds more overhead to the network.

Just do me a favor: disable it when you don't need it :)